<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://www.edegan.com/mediawiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Ed</id>
	<title>edegan.com - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="http://www.edegan.com/mediawiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Ed"/>
	<link rel="alternate" type="text/html" href="http://www.edegan.com/wiki/Special:Contributions/Ed"/>
	<updated>2026-05-24T23:37:19Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.34.2</generator>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=Ed_Egan&amp;diff=48677</id>
		<title>Ed Egan</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=Ed_Egan&amp;diff=48677"/>
		<updated>2025-05-01T15:13:44Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Team Member&lt;br /&gt;
|Has name=Ed Egan&lt;br /&gt;
|Has headshot=EdEganHeadshot1.jpg&lt;br /&gt;
|Has team position=Faculty&lt;br /&gt;
|Has team status=Active&lt;br /&gt;
|Has or doing degree=PhD&lt;br /&gt;
|Has academic major=Business Admin.&lt;br /&gt;
|Has email=ed@edegan.com&lt;br /&gt;
|Has skype=EdwardJEgan&lt;br /&gt;
|Went to school=U.C. Berkeley&lt;br /&gt;
|Has graduating class=2012&lt;br /&gt;
|Has job title=Director&lt;br /&gt;
|Has URL=www.edegan.com&lt;br /&gt;
|Has team sponsor=McNair Center, Kauffman Incubator Project, edegan.com&lt;br /&gt;
}}&lt;br /&gt;
==Short Biography==&lt;br /&gt;
&lt;br /&gt;
Edward J. Egan, Ph.D., is an applied micro-economist. He received his Ph.D. from the [https://haas.berkeley.edu/ Haas School of Business at the University of California, Berkeley] in 2012, where he specialized in business economics, public policy, and finance.&lt;br /&gt;
&lt;br /&gt;
Dr. Egan has previously held positions as a visiting assistant professor at the [https://msb.georgetown.edu/ McDonough School of Business at Georgetown University], the director of the [https://www.bakerinstitute.org/mcnair-center/ McNair Center for Entrepreneurship and Innovation at Rice University’s Baker Institute], an assistant professor of entrepreneurship at [https://www.imperial.ac.uk/business-school/research/innovation-and-entrepreneurship/ Imperial College Business School], and the innovation policy fellow at the [https://www.nber.org/ National Bureau of Economic Research].&lt;br /&gt;
&lt;br /&gt;
Much of his academic research is policy-relevant and in the fields of entrepreneurship and innovation. He has received research awards from the [http://www.sshrc-crsh.gc.ca/funding-financement/programs-programmes/fellowships/doctoral-doctorat-eng.aspx Government of Canada] and the [https://www.kauffman.org/microsites/kdf/kauffman-dissertation-fellows Ewing Marion Kauffman Foundation], including most recently a [https://www.kauffman.org/currents/2018/03/uncommon-methods-and-metrics-portfolio-kicks-off-new-research-focus UMM grant]. &lt;br /&gt;
&lt;br /&gt;
Dr. Egan is a serial entrepreneur who co-founded his first high-tech startup at the age of 19. He also worked as a venture capitalist in Vancouver, Canada. Throughout his career, Dr. Egan has been active as an economic advisor to local, state, and federal governments, as well as an economic consultant to firms ranging from pre-incorporation startups to Fortune 500 companies.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=An_Economic_Research_Wiki&amp;diff=48676</id>
		<title>An Economic Research Wiki</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=An_Economic_Research_Wiki&amp;diff=48676"/>
		<updated>2025-05-01T15:12:56Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;opacity:0;position:absolute;&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
&amp;lt;templatestyles src=&amp;quot;Template:Main_page/styles.css&amp;quot; /&amp;gt;&lt;br /&gt;
{{Colored box|title=Switch Sites?|icon=Arrow-left-right-fill.svg|content= &amp;lt;h3&amp;gt;[[File:Cropped-ee-LightGreyOnDarkBlue-1.png|50px|frameless|link=https://www.edegan.com]] Go to the [https://www.edegan.com main site] on edegan.com.&amp;lt;/h3&amp;gt;}}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div&amp;gt;&lt;br /&gt;
==An Economic Research Wiki==&lt;br /&gt;
&amp;lt;div aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:StoweTownHall50pcEdEgan.jpg|right|thumb|320px|This site is hosted in Stowe, Vermont]]&amp;lt;/div&amp;gt;&lt;br /&gt;
This site hosts a [https://www.mediawiki.org MediaWiki]-based collaboration platform, which provides a development environment, documentation, and content for [[affiliate]]d economic researchers, as well as computer scientists and finance professionals. It primarily supports research led by or conducted in partnership with [[Ed Egan]]. Accordingly, much of the platform is targeted towards research in corporate finance, industrial organization, business economics, and firm strategy, especially as it is applied to the fields of entrepreneurship and innovation. However, the [[tool|tools]], [[data]], and other [[projects|project]] outputs it documents are applicable to many other disciplines and topic areas.&lt;br /&gt;
&lt;br /&gt;
Previous incarnations of this wiki supported the Georgetown University team working on the [[Kauffman Incubator Project]], the work done by over [[McNair Team|70 research interns]] at the [[McNair Center for Entrepreneurship and Innovation]] at Rice University's Baker Institute, and the doctoral community at the Haas School of Business, U.C. Berkeley, especially in the [[Business and Public Policy]] group. Much of this content, as well as several hundred pages from prior versions of Ed's personal website, is available as part of the [[Information Library]]. A variety of [[how-to]]s and [[guide|guides]], including documentation on how to build and configure [[Research Computing Infrastructure]], and other useful resources are also archived there.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;audiences&amp;quot; class=&amp;quot;mainpage_row&amp;quot;&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Settings-5-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;Project Documentation&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-projects&amp;quot; title=&amp;quot;Projects&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
*Read about existing [[Projects]]&lt;br /&gt;
*Research [[Tool|Tools]]&lt;br /&gt;
*Learn about exciting [[Data]]&lt;br /&gt;
*View [[Featured projects]]&lt;br /&gt;
*[[Form:Project|Start a new project]]&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Book-open-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;Research Articles&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-papers&amp;quot; title=&amp;quot;Papers&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
*Review [[Paper Development|papers in development]] on this platform&lt;br /&gt;
*Read summaries of important and/or famous [[article|research papers]]&lt;br /&gt;
*View [https://www.edegan.com/blog published blog articles], or see the development material for [[all McNair blog posts]]&lt;br /&gt;
*View summaries of [[U.S. Federal Legislation]]&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:user-3-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;User Community&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-users&amp;quot; title=&amp;quot;Users&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
*Get to know [[Ed Egan]]&lt;br /&gt;
*View the [[team member]] directory&lt;br /&gt;
*See [[Affiliate]] profiles&lt;br /&gt;
*[[Special:RequestAccount|Request an account]] or [[Help:General|Get help!]]&lt;br /&gt;
*[[Research Computing Infrastructure]]&lt;br /&gt;
&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;misc-news&amp;quot; class=&amp;quot;mainpage_row&amp;quot;&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Government-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;Information Library&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-library&amp;quot; title=&amp;quot;Information Library&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
Popular [[Information Library|library]] pages include:&lt;br /&gt;
*[[US Startup City Ranking]]&lt;br /&gt;
*[[Addressing Ubuntu NVIDIA Issues]]&lt;br /&gt;
*An [[economic definition of true love]]&lt;br /&gt;
*[[Jim Brander's Rules of Writing]]&lt;br /&gt;
*[[How to effectively hinder learning]]&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Information-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;News&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-news&amp;quot; title=&amp;quot;News&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;div style=&amp;quot;margin: auto; vertical-align:top; text-align:left&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;div class=&amp;quot;mainpage_boxcontents_small&amp;quot;&amp;gt;&lt;br /&gt;
; 2020-09&lt;br /&gt;
: The [[Research Computing Configuration|Configuration on Mother]], which hosts this site, was completely updated. Mother now runs Ubuntu 20.04, PHP 7.3, MySQL 8.0.21, PostgreSQL 12, MediaWiki 1.34, WordPress 5.5.1, etc. The wiki and the blog also received design upgrades.&lt;br /&gt;
; 2020-09-19&lt;br /&gt;
: Installed [https://www.mediawiki.org/wiki/Extension:HitCounters Extension:HitCounters], so the hit counts start from this date.&lt;br /&gt;
; 2020-10-20&lt;br /&gt;
: Launched the new blog named [https://www.edegan.com/blog The Economics of Growth]. [https://www.edegan.com/blog /blog] is now the landing subdomain.&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;opacity:0;position:absolute;&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
&amp;lt;templatestyles src=&amp;quot;Template:Main_page/styles.css&amp;quot; /&amp;gt;&lt;br /&gt;
{{Colored box|title=Blog Archive|icon=Arrow-right-circle-fill.svg|content= &amp;lt;h3&amp;gt;[[File:Edegandotcomslasharticles-LightGreyOnDarkBlue-155.png|50px|frameless|link=https://www.edegan.com/articles]] Read articles from the [https://www.edegan.com/articles McNair Center Blog Archive].&amp;lt;/h3&amp;gt;}}&lt;br /&gt;
&lt;br /&gt;
{{#set: Editable by::whitelist|Editable by user::Ed}}&lt;br /&gt;
&lt;br /&gt;
__NOEDITSECTION__&lt;br /&gt;
__NOTOC__&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=An_Economic_Research_Wiki&amp;diff=48675</id>
		<title>An Economic Research Wiki</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=An_Economic_Research_Wiki&amp;diff=48675"/>
		<updated>2025-05-01T15:11:56Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;opacity:0;position:absolute;&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
&amp;lt;templatestyles src=&amp;quot;Template:Main_page/styles.css&amp;quot; /&amp;gt;&lt;br /&gt;
{{Colored box|title=Switch Sites?|icon=Arrow-left-right-fill.svg|content= &amp;lt;h3&amp;gt;[[File:Cropped-ee-LightGreyOnDarkBlue-1.png|50px|frameless|link=https://www.edegan.com]] Go to the [https://www.edegan.com main site] on edegan.com.&amp;lt;/h3&amp;gt;}}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div&amp;gt;&lt;br /&gt;
==An Economic Research Wiki==&lt;br /&gt;
&amp;lt;div aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:StoweTownHall50pcEdEgan.jpg|right|thumb|320px|This site is hosted in Stowe, Vermont]]&amp;lt;/div&amp;gt;&lt;br /&gt;
This site hosts a [https://www.mediawiki.org MediaWiki]-based collaboration platform, which provides a development environment, documentation, and content for [[affiliate]]d economic researchers, as well as computer scientists and finance professionals. It primarily supports research led by or conducted in partnership with [[Ed Egan]]. Accordingly, much of the platform is targeted towards research in corporate finance, industrial organization, business economics, and firm strategy, especially as it is applied to the fields of entrepreneurship and innovation. However, the [[tool|tools]], [[data]], and other [[projects|project]] outputs it documents are applicable to many other disciplines and topic areas.&lt;br /&gt;
&lt;br /&gt;
Previous incarnations of this wiki supported the Georgetown University team working on the [[Kauffman Incubator Project]], the work done by over [[McNair Team|70 research interns]] at the [[McNair Center for Entrepreneurship and Innovation]] at Rice University's Baker Institute, and the doctoral community at the Haas School of Business, U.C. Berkeley, especially in the [[Business and Public Policy]] group. Much of this content, as well as several hundred pages from prior versions of Ed's personal website, is available as part of the [[Information Library]]. A variety of [[how-to]]s and [[guide|guides]], including documentation on how to build and configure [[Research Computing Infrastructure]], and other useful resources are also archived there.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;audiences&amp;quot; class=&amp;quot;mainpage_row&amp;quot;&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Settings-5-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;Project Documentation&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-projects&amp;quot; title=&amp;quot;Projects&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
*Read about existing [[Projects]]&lt;br /&gt;
*Research [[Tool|Tools]]&lt;br /&gt;
*Learn about exciting [[Data]]&lt;br /&gt;
*View [[Featured projects]]&lt;br /&gt;
*[[Form:Project|Start a new project]]&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Book-open-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;Research Articles&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-papers&amp;quot; title=&amp;quot;Papers&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
*Review [[Paper Development|papers in development]] on this platform&lt;br /&gt;
*Read summaries of important and/or famous [[article|research papers]]&lt;br /&gt;
*View [https://www.edegan.com/blog published blog articles], or see the development material for [[all McNair blog posts]]&lt;br /&gt;
*View summaries of [[U.S. Federal Legislation]]&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:user-3-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;User Community&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-users&amp;quot; title=&amp;quot;Users&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
*Get to know [[Ed Egan]]&lt;br /&gt;
*View the [[team member]] directory&lt;br /&gt;
*See [[Affiliate]] profiles&lt;br /&gt;
*[[Special:RequestAccount|Request an account]] or [[Help:General|Get help!]]&lt;br /&gt;
*[[Research Computing Infrastructure]]&lt;br /&gt;
&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;misc-news&amp;quot; class=&amp;quot;mainpage_row&amp;quot;&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Government-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;Information Library&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-library&amp;quot; title=&amp;quot;Information Library&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
Popular [[Information Library|library]] pages include:&lt;br /&gt;
*[[US Startup City Ranking]]&lt;br /&gt;
*[[Addressing Ubuntu NVIDIA Issues]]&lt;br /&gt;
*An [[economic definition of true love]]&lt;br /&gt;
*[[Jim Brander's Rules of Writing]]&lt;br /&gt;
*[[How to effectively hinder learning]]&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Information-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;News&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-news&amp;quot; title=&amp;quot;News&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;div style=&amp;quot;margin: auto; vertical-align:top; text-align:left&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;div class=&amp;quot;mainpage_boxcontents_small&amp;quot;&amp;gt;&lt;br /&gt;
; 2020-09&lt;br /&gt;
: The [[Research Computing Configuration|Configuration on Mother]], which hosts this site, was completely updated. Mother now runs Ubuntu 20.04, PHP 7.3, MySQL 8.0.21, PostgreSQL 12, MediaWiki 1.34, WordPress 5.5.1, etc. The wiki and the blog also received design upgrades.&lt;br /&gt;
; 2020-09-19&lt;br /&gt;
: Installed [https://www.mediawiki.org/wiki/Extension:HitCounters Extension:HitCounters], so the hit counts start from this date.&lt;br /&gt;
; 2020-10-20&lt;br /&gt;
: Launched the new blog named [https://www.edegan.com/blog The Economics of Growth]. [https://www.edegan.com/blog /blog] is now the landing subdomain.&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;opacity:0;position:absolute;&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
&amp;lt;templatestyles src=&amp;quot;Template:Main_page/styles.css&amp;quot; /&amp;gt;&lt;br /&gt;
{{Colored box|title=Switch Sites?|icon=Arrow-right-circle-fill.svg|content= &amp;lt;h3&amp;gt;[[File:Edegandotcomslasharticles-LightGreyOnDarkBlue-155.png|50px|frameless|link=https://www.edegan.com/articles]] Read articles from the '''[https://www.edegan.com/articles McNair Center Blog Archive]'''.&amp;lt;/h3&amp;gt;}}&lt;br /&gt;
&lt;br /&gt;
{{#set: Editable by::whitelist|Editable by user::Ed}}&lt;br /&gt;
&lt;br /&gt;
__NOEDITSECTION__&lt;br /&gt;
__NOTOC__&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=File:Arrow-right-circle-fill.svg&amp;diff=48674</id>
		<title>File:Arrow-right-circle-fill.svg</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=File:Arrow-right-circle-fill.svg&amp;diff=48674"/>
		<updated>2025-05-01T15:11:43Z</updated>

		<summary type="html">&lt;p&gt;Ed: https://commons.wikimedia.org/wiki/File:Arrow-right-circle-fill.svg&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Summary ==&lt;br /&gt;
https://commons.wikimedia.org/wiki/File:Arrow-right-circle-fill.svg&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=An_Economic_Research_Wiki&amp;diff=48673</id>
		<title>An Economic Research Wiki</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=An_Economic_Research_Wiki&amp;diff=48673"/>
		<updated>2025-05-01T15:07:53Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;opacity:0;position:absolute;&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
&amp;lt;templatestyles src=&amp;quot;Template:Main_page/styles.css&amp;quot; /&amp;gt;&lt;br /&gt;
{{Colored box|title=Switch Sites?|icon=Arrow-left-right-fill.svg|content= &amp;lt;h3&amp;gt;[[File:Cropped-ee-LightGreyOnDarkBlue-1.png|50px|frameless|link=https://www.edegan.com]] Go to the '''[https://www.edegan.com main site on edegan.com]'''.&amp;lt;/h3&amp;gt;}}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div&amp;gt;&lt;br /&gt;
==An Economic Research Wiki==&lt;br /&gt;
&amp;lt;div aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:StoweTownHall50pcEdEgan.jpg|right|thumb|320px|This site is hosted in Stowe, Vermont]]&amp;lt;/div&amp;gt;&lt;br /&gt;
This site hosts a [https://www.mediawiki.org MediaWiki]-based collaboration platform, which provides a development environment, documentation, and content for [[affiliate]]d economic researchers, as well as computer scientists and finance professionals. It primarily supports research led by or conducted in partnership with [[Ed Egan]]. Accordingly, much of the platform is targeted towards research in corporate finance, industrial organization, business economics, and firm strategy, especially as it is applied to the fields of entrepreneurship and innovation. However, the [[tool|tools]], [[data]], and other [[projects|project]] outputs it documents are applicable to many other disciplines and topic areas.&lt;br /&gt;
&lt;br /&gt;
Previous incarnations of this wiki supported the Georgetown University team working on the [[Kauffman Incubator Project]], the work done by over [[McNair Team|70 research interns]] at the [[McNair Center for Entrepreneurship and Innovation]] at Rice University's Baker Institute, and the doctoral community at the Haas School of Business, U.C. Berkeley, especially in the [[Business and Public Policy]] group. Much of this content, as well as several hundred pages from prior versions of Ed's personal website, is available as part of the [[Information Library]]. A variety of [[how-to]]s and [[guide|guides]], including documentation on how to build and configure [[Research Computing Infrastructure]], and other useful resources are also archived there.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;audiences&amp;quot; class=&amp;quot;mainpage_row&amp;quot;&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Settings-5-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;Project Documentation&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-projects&amp;quot; title=&amp;quot;Projects&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
*Read about existing [[Projects]]&lt;br /&gt;
*Research [[Tool|Tools]]&lt;br /&gt;
*Learn about exciting [[Data]]&lt;br /&gt;
*View [[Featured projects]]&lt;br /&gt;
*[[Form:Project|Start a new project]]&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Book-open-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;Research Articles&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-papers&amp;quot; title=&amp;quot;Papers&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
*Review [[Paper Development|papers in development]] on this platform&lt;br /&gt;
*Read summaries of important and/or famous [[article|research papers]]&lt;br /&gt;
*View [https://www.edegan.com/blog published blog articles], or see the development material for [[all McNair blog posts]]&lt;br /&gt;
*View summaries of [[U.S. Federal Legislation]]&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:user-3-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;User Community&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-users&amp;quot; title=&amp;quot;Users&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
*Get to know [[Ed Egan]]&lt;br /&gt;
*View the [[team member]] directory&lt;br /&gt;
*See [[Affiliate]] profiles&lt;br /&gt;
*[[Special:RequestAccount|Request an account]] or [[Help:General|Get help!]]&lt;br /&gt;
*[[Research Computing Infrastructure]]&lt;br /&gt;
&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;misc-news&amp;quot; class=&amp;quot;mainpage_row&amp;quot;&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Government-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;Information Library&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-library&amp;quot; title=&amp;quot;Information Library&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
Popular [[Information Library|library]] pages include:&lt;br /&gt;
*[[US Startup City Ranking]]&lt;br /&gt;
*[[Addressing Ubuntu NVIDIA Issues]]&lt;br /&gt;
*An [[economic definition of true love]]&lt;br /&gt;
*[[Jim Brander's Rules of Writing]]&lt;br /&gt;
*[[How to effectively hinder learning]]&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;div class=&amp;quot;mainpage_box&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;h2&amp;gt;&amp;lt;span class=&amp;quot;header_icon&amp;quot; aria-hidden=&amp;quot;true&amp;quot; role=&amp;quot;presentation&amp;quot;&amp;gt;[[File:Information-fill.svg|36px|middle|link=]]&amp;lt;/span&amp;gt;&amp;lt;span&amp;gt;News&amp;lt;/span&amp;gt;&amp;lt;/h2&amp;gt;&lt;br /&gt;
		&amp;lt;div id=&amp;quot;mainpage-news&amp;quot; title=&amp;quot;News&amp;quot; class=&amp;quot;items&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;div style=&amp;quot;margin: auto; vertical-align:top; text-align:left&amp;quot;&amp;gt;&lt;br /&gt;
		&amp;lt;div class=&amp;quot;mainpage_boxcontents_small&amp;quot;&amp;gt;&lt;br /&gt;
; 2020-09&lt;br /&gt;
: The [[Research Computing Configuration|Configuration on Mother]], which hosts this site, was completely updated. Mother now runs Ubuntu 20.04, PHP 7.3, MySQL 8.0.21, PostgreSQL 12, MediaWiki 1.34, WordPress 5.5.1, etc. The wiki and the blog also received design upgrades.&lt;br /&gt;
; 2020-09-19&lt;br /&gt;
: Installed [https://www.mediawiki.org/wiki/Extension:HitCounters Extension:HitCounters], so the hit counts start from this date.&lt;br /&gt;
; 2020-10-20&lt;br /&gt;
: Launched the new blog named [https://www.edegan.com/blog The Economics of Growth]. [https://www.edegan.com/blog /blog] is now the landing subdomain.&lt;br /&gt;
		&amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;opacity:0;position:absolute;&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
&amp;lt;templatestyles src=&amp;quot;Template:Main_page/styles.css&amp;quot; /&amp;gt;&lt;br /&gt;
{{Colored box|title=Switch Sites?|icon=Arrow-left-right-fill.svg|content= &amp;lt;h3&amp;gt;[[File:Edegandotcomslasharticles-LightGreyOnDarkBlue-155.png|50px|frameless|link=https://www.edegan.com/articles]] Read articles from the '''[https://www.edegan.com/articles McNair Center Blog Archive]'''.&amp;lt;/h3&amp;gt;}}&lt;br /&gt;
&lt;br /&gt;
{{#set: Editable by::whitelist|Editable by user::Ed}}&lt;br /&gt;
&lt;br /&gt;
__NOEDITSECTION__&lt;br /&gt;
__NOTOC__&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=File:Cropped-ee-LightGreyOnDarkBlue-1.png&amp;diff=48672</id>
		<title>File:Cropped-ee-LightGreyOnDarkBlue-1.png</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=File:Cropped-ee-LightGreyOnDarkBlue-1.png&amp;diff=48672"/>
		<updated>2025-05-01T15:06:38Z</updated>

		<summary type="html">&lt;p&gt;Ed: EdEgan.com favicon (ee)&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Summary ==&lt;br /&gt;
EdEgan.com favicon (ee)&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VentureXpert_Data&amp;diff=48671</id>
		<title>VentureXpert Data</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VentureXpert_Data&amp;diff=48671"/>
		<updated>2025-01-16T16:13:46Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|Has title=VentureXpert Data&lt;br /&gt;
|Has owner=Ed Egan&lt;br /&gt;
|Has start date=2018/06/20&lt;br /&gt;
|Has project output=Data, How-to&lt;br /&gt;
|Has project status=Active&lt;br /&gt;
|Has sponsor=McNair Center&lt;br /&gt;
}}&lt;br /&gt;
The successors to this project include:&lt;br /&gt;
*[[VCDB24]], which is the most recent iteration.&lt;br /&gt;
*[[VCDB23]]&lt;br /&gt;
*[[VCDB20Q3]]&lt;br /&gt;
*[[VCDB20H1]]&lt;br /&gt;
*[[Vcdb4]]&lt;br /&gt;
&lt;br /&gt;
==Relevant Former Projects==&lt;br /&gt;
#[[Venture Capital (Data)]]&lt;br /&gt;
#[[Retrieving US VC Data From SDC]]&lt;br /&gt;
#[[VC Database Rebuild]]&lt;br /&gt;
&lt;br /&gt;
==Location==&lt;br /&gt;
My scripts for SDC pulls are located in the E drive in the location:&lt;br /&gt;
 E:\McNair\Projects\VentureXpertDatabase\ScriptsForSDCExtract&lt;br /&gt;
&lt;br /&gt;
My successfully pulled and normalized files are stored in the location:&lt;br /&gt;
 E:\McNair\Projects\VentureXpertDatabase\ExtractedDataQ2&lt;br /&gt;
&lt;br /&gt;
My scripts for loading tables and data are in:&lt;br /&gt;
 E:\McNair\Projects\VentureXpertDatabase\vcdb3\LoadingScripts&lt;br /&gt;
&lt;br /&gt;
There are a variety of SQL files in there with self-explanatory names. The file that has all of the loading scripts is called LoadingScriptsV1. The folder vcdb2 is there for reference, to show what previous contributors had done. ExtractedData is there because I first pulled data before July 1st, and Ed asked me to repull it.&lt;br /&gt;
&lt;br /&gt;
==Goal==&lt;br /&gt;
I will redesign the VentureXpert Database so that its structure is more intuitive than the previous version's. I will also update the database with current data.&lt;br /&gt;
&lt;br /&gt;
==Initial Stages==&lt;br /&gt;
The first step of the project was to figure out what primary keys to use for each major table that I create. I looked at the primary keys used in the creation of the [[VC Database Rebuild]] and found primary keys that are decent. I have updated them and list them below:&lt;br /&gt;
&lt;br /&gt;
#CompanyBaseCore - coname, statecode, datefirstinv&lt;br /&gt;
#IPOCore - issuer, issuedate, statecode&lt;br /&gt;
#MACore - target name, target state code, announceddate&lt;br /&gt;
#Geo - city, statecode, coname, datefirst, year&lt;br /&gt;
#DeadDate - coname, statecode, datefirst, rounddate (tentative; could still change)&lt;br /&gt;
#RoundCore - coname, statecode, datefirst, rounddate&lt;br /&gt;
#FirmBaseCore - firmname&lt;br /&gt;
#FundBaseCore - fund name (firstinvdate doesn't work because not every row has an entry)&lt;br /&gt;
&lt;br /&gt;
These are my initial listings and I will come back to update them if needed. &lt;br /&gt;
&lt;br /&gt;
The second part of the initial stage has been to pull data from the SDC Platinum platform. I did it in July to ensure that I had two full quarters of data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==SDC Pull==&lt;br /&gt;
When pulling data from SDC, look first for previously made rpt files that have the names of the pulls you need. They have already been created and will save you a lot of work. The rpt files that I used are in the folder VentureXpertDB/ScriptsForSDCExtract. The files come in pairs, one saved as an ssh file and one as an rpt file. To make the dates current, open the ssh file of the pair and change the date of last investment. When you open SDC, you will be asked which database to pull from. For each type of file, choose the following:&lt;br /&gt;
&lt;br /&gt;
#VentureXpert - PortCo, PortCoLong, USVC, Firms, BranchOffices, Funds, Rounds, VCFirmLong&lt;br /&gt;
#Mergers &amp;amp; Acquisitions - MAs&lt;br /&gt;
#Global New Issues Databases - IPOs&lt;br /&gt;
&lt;br /&gt;
Help on pulling data from SDC is on the [[SDC Platinum (Wiki)]] page. &lt;br /&gt;
&lt;br /&gt;
===VCFund Pull Problem===&lt;br /&gt;
When pulling the VCFund1980-Present, I encountered two problems. First, SDC's built-in filters cannot restrict the funds to US only. Second, there are multiple rpt files that specify different variables for the fund pull. I pulled from both to be safe, and both are saved in the ExtractedData folder. The [[VC Database Rebuild]] page has a section on the fund pull where Ed specifies which rpt file he used, and after I spoke with Ed, he told me to use the VCFund1980-present.rpt file to extract the data. I also had various problems during extraction, including the SDC program freezing and an Out of Memory error. Check the [[SDC Platinum (Wiki)]] page to fix these issues.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Loading Tables==&lt;br /&gt;
When I describe errors I encountered, I will not describe them using line numbers, because as soon as any data is added the line numbers become useless. Instead, I recommend that you copy the normalized file you are working with into an Excel file and use the filter feature. This way you can find the line in your specific file that is causing errors and fix it in the file itself. The line numbers that PuTTY errors display are often wrong, so relying on Excel was the fastest way for me to find each error. If my instructions are not enough for you to find the error, pick key words from the line that PuTTY says is causing the error and filter for them in Excel.&lt;br /&gt;
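&lt;br /&gt;
As an alternative to the Excel filter, a short script can flag candidate lines before loading. This is only a minimal sketch and was not part of the original workflow: the function name find_suspect_lines and the sample rows are hypothetical, and it assumes a tab-delimited file whose first line is the header. It reports rows whose field count differs from the header and rows that contain a double quote, which were the kinds of problems described on this page.&lt;br /&gt;
&lt;br /&gt;
```python
import io


def find_suspect_lines(lines, delimiter="\t"):
    """Return (line_number, reason) pairs for rows likely to break a CSV-mode load.

    Line numbers are 1-based and count the header as line 1.
    """
    suspects = []
    expected = None
    for number, raw in enumerate(lines, start=1):
        row = raw.rstrip("\r\n")
        fields = row.split(delimiter)
        if expected is None:
            expected = len(fields)  # the header defines the column count
            continue
        if len(fields) != expected:
            suspects.append(
                (number, "expected %d fields, found %d" % (expected, len(fields)))
            )
        if '"' in row:
            suspects.append((number, "contains a double quote"))
    return suspects


# Hypothetical sample standing in for a normalized SDC extract.
sample = io.StringIO(
    "firmname\tcity\tareacode\n"
    "Alpha Ventures\tHouston\t713\n"
    "Beta \"Capital\tAustin\t512\n"
    "Gamma Partners\tDallas\n"
)
report = find_suspect_lines(sample)
for number, reason in report:
    print(number, reason)
```
&lt;br /&gt;
To run this against a real extract, pass an open file object instead of the in-memory sample. The reported numbers refer to lines of the text file itself, so they can be used directly in a text editor.&lt;br /&gt;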
&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE roundbase;&lt;br /&gt;
 CREATE TABLE roundbase (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   rounddate date,&lt;br /&gt;
   updateddate date,&lt;br /&gt;
   foundingdate date,&lt;br /&gt;
   datelastinv date,&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   investedk real,&lt;br /&gt;
   city varchar(100),&lt;br /&gt;
   description varchar(5000),&lt;br /&gt;
   msa varchar(100),&lt;br /&gt;
   msacode varchar(10),&lt;br /&gt;
   nationcode varchar(10),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   addr1 varchar(100),&lt;br /&gt;
   addr2 varchar(100),&lt;br /&gt;
   indclass varchar(100),&lt;br /&gt;
   indsubgroup3 varchar(100),&lt;br /&gt;
   indminor varchar(100),&lt;br /&gt;
   url varchar(5000),&lt;br /&gt;
   zip varchar(10),&lt;br /&gt;
   stage1 varchar(100),&lt;br /&gt;
   stage3 varchar(100),&lt;br /&gt;
   rndamtdisck real,&lt;br /&gt;
   rndamtestk real,&lt;br /&gt;
   roundnum integer,&lt;br /&gt;
   numinvestors integer&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY roundbase FROM 'USVC1980-2018q2-Good.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --151549&lt;br /&gt;
&lt;br /&gt;
The only error I encountered here was with Cardtronic Technology Inc. That row has a mixture of quotation marks, which causes errors in loading. Find it using the Excel trick and remove the quotation marks manually.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ipos;&lt;br /&gt;
 CREATE TABLE ipos (&lt;br /&gt;
   issuedate date,&lt;br /&gt;
   issuer varchar(255),&lt;br /&gt;
   statecode varchar(10), &lt;br /&gt;
   principalamt money, --million&lt;br /&gt;
   proceedsamt money, --sum of all markets in million&lt;br /&gt;
   naiccode varchar(255), --primary NAIC code&lt;br /&gt;
   zipcode varchar(10),&lt;br /&gt;
   status varchar (20),&lt;br /&gt;
   foundeddate date&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY ipos FROM 'IPO1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --12107&lt;br /&gt;
&lt;br /&gt;
I encountered no errors while loading this data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE branchoffices;&lt;br /&gt;
 CREATE TABLE branchoffices (&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   bocity varchar(100),&lt;br /&gt;
   bostate varchar(2),&lt;br /&gt;
   bocountrycode varchar(2),&lt;br /&gt;
   bonation varchar(100),&lt;br /&gt;
   bozip varchar(10),&lt;br /&gt;
   boaddr1 varchar(100),&lt;br /&gt;
   boaddr2 varchar(100)&lt;br /&gt;
 &lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY branchoffices FROM 'USVCFirmBranchOffices1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --10353&lt;br /&gt;
&lt;br /&gt;
I encountered no errors while loading this data.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE roundline;&lt;br /&gt;
 CREATE TABLE roundline (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(2),&lt;br /&gt;
   datelastinv date,&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   rounddate date,&lt;br /&gt;
   disclosedamt real,&lt;br /&gt;
   fundname varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY roundline FROM 'USVCRound1980-2018q2-NoFoot-normal-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --403189&lt;br /&gt;
&lt;br /&gt;
I encountered no errors while loading this data.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundbase;&lt;br /&gt;
 CREATE TABLE fundbase (&lt;br /&gt;
   fundname varchar(255),&lt;br /&gt;
   closedate date, --mm-dd-yyyy&lt;br /&gt;
   lastinvdate date, --mm-dd-yyyy&lt;br /&gt;
   firstinvdate date, --mm-dd-yyyy&lt;br /&gt;
   numportcos integer,&lt;br /&gt;
   investedk real,&lt;br /&gt;
   city varchar(100),&lt;br /&gt;
   fundyear varchar(4), --yyyy&lt;br /&gt;
   zip varchar(10),&lt;br /&gt;
   statecode varchar(2),&lt;br /&gt;
   fundsizem real,&lt;br /&gt;
   fundstage varchar(100),&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   dateinfoupdate date,&lt;br /&gt;
   invtype varchar(100),&lt;br /&gt;
   msacode varchar(10),&lt;br /&gt;
   nationcode varchar(10),&lt;br /&gt;
   raisestatus varchar(100),&lt;br /&gt;
   seqnum integer,&lt;br /&gt;
   targetsizefund real,&lt;br /&gt;
   fundtype varchar(100)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY fundbase FROM 'VCFund1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --29397&lt;br /&gt;
&lt;br /&gt;
There is a Ukrainian fund that has stray quotation marks in its name. It is called something along the lines of &amp;quot;VAT &amp;quot;ZNVKIF &amp;quot;Skhidno-Evropeis'lyi investytsiynyi Fond&amp;quot;. If this does not help, you can filter in Excel using Kiev as the keyword in the city column and find the line where you are getting errors. Then manually remove the stray quotation marks in the actual text file. After that, the table should load correctly.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmbase;&lt;br /&gt;
 CREATE TABLE firmbase(&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   foundingdate date, --mm-dd-yyyy&lt;br /&gt;
   datefirstinv date, --mm-dd-yyyy  &lt;br /&gt;
   datelastinv date, --mm-dd-yyyy&lt;br /&gt;
   addr1 varchar(100),&lt;br /&gt;
   addr2 varchar(100),&lt;br /&gt;
   location varchar(100),&lt;br /&gt;
   city varchar(100),&lt;br /&gt;
   zip varchar(10),&lt;br /&gt;
   areacode integer,&lt;br /&gt;
   county varchar(100),&lt;br /&gt;
   state varchar(2),&lt;br /&gt;
   nationcode varchar(10),&lt;br /&gt;
   nation varchar(100),&lt;br /&gt;
   worldregion varchar(100),&lt;br /&gt;
   numportcos integer,&lt;br /&gt;
   numrounds integer,&lt;br /&gt;
   investedk money,&lt;br /&gt;
   capitalundermgmt money,  &lt;br /&gt;
   invstatus varchar(100),&lt;br /&gt;
   msacode varchar(10),&lt;br /&gt;
   rolepref varchar(100),&lt;br /&gt;
   geogpref varchar(100),&lt;br /&gt;
   indpref varchar(100),&lt;br /&gt;
   stagepref varchar(100),&lt;br /&gt;
   type varchar(100)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 \COPY firmbase FROM 'USVCFirms1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --15899&lt;br /&gt;
&lt;br /&gt;
The normalization for this file was wrong when I first tried to load the data. To fix this, go to the file from which you removed the footer and find the column header titled Firm Capital under Mgmt{0Mil}. Delete the {0Mil} and renormalize the file. A good way to check the result is to copy and paste the normalized file into an Excel sheet and see whether the entries line up correctly with their column headers.&lt;br /&gt;
The second error I found was with the Kerala Ventures firm. Here the address has the word l&amp;quot;opera in it. This quotation mark will cause errors, so find the line using Excel and remove the quotation mark manually.&lt;br /&gt;
The third error is in an area code where 1-8 is written. The areacode column is an integer, so the hyphen causes an error. Interestingly, the line number given by PuTTY was correct here, and I found it in my text file and deleted it manually.&lt;br /&gt;
These were the only errors I encountered while loading this table.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE mas;&lt;br /&gt;
 CREATE TABLE mas (&lt;br /&gt;
   announceddate date,&lt;br /&gt;
   effectivedate date,&lt;br /&gt;
   targetname varchar(255),&lt;br /&gt;
   targetstate varchar(100),&lt;br /&gt;
   acquirorname varchar(255),&lt;br /&gt;
   acquirorstate varchar(100),&lt;br /&gt;
   transactionamt money,&lt;br /&gt;
   enterpriseval varchar(255),&lt;br /&gt;
   acquirorstatus varchar(150)&lt;br /&gt;
 );&lt;br /&gt;
 \COPY mas FROM 'MAUSTargetComp100pc1985-July2018-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --119432&lt;br /&gt;
&lt;br /&gt;
I encountered no problems loading in this data.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE longdescription;&lt;br /&gt;
 CREATE TABLE longdescription(&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   fundingdate date, --date co received first inv&lt;br /&gt;
   codescription varchar(10000) --long description&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY longdescription FROM 'PortCoLongDesc-Ready-normal-fixed.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --48037&lt;br /&gt;
&lt;br /&gt;
I encountered no problems loading this data.&lt;br /&gt;
&lt;br /&gt;
==Cleaning Companybase, Fundbase, Firmbase, and BranchOffice==&lt;br /&gt;
===Cleaning Company===&lt;br /&gt;
The primary key for port cos will be coname, datefirstinv, and statecode. Before checking whether this is a valid primary key, remove the undisclosed companies. I will explain the second part of the query concerning New York Digital Health later. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE companybasecore;&lt;br /&gt;
 CREATE TABLE companybasecore AS&lt;br /&gt;
 SELECT * &lt;br /&gt;
 FROM Companybase WHERE nationcode = 'US' AND coname != 'Undisclosed Company' &lt;br /&gt;
 AND NOT (coname='New York Digital Health LLC' AND statecode='NY' AND datefirstinv='2015-08-13' AND updateddate='2015-10-20');&lt;br /&gt;
 --48001&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM (SELECT DISTINCT coname, statecode, datefirstinv FROM companybasecore) AS T;&lt;br /&gt;
 --48001&lt;br /&gt;
Since the row count of the table equals the count of distinct primary keys, the primary key is valid. In the initial cleaning of the table, I first filtered out only the undisclosed companies. This table had 48002 rows. I then ran the DISTINCT query above and found that there are 48001 distinct values of the coname, datefirstinv, statecode primary key. Thus there must be two rows that share a primary key. I found this key using the following query:&lt;br /&gt;
&lt;br /&gt;
 SELECT * FROM (SELECT coname, datefirstinv, statecode FROM companybase) as key GROUP BY coname, datefirstinv, statecode HAVING COUNT(key) &amp;gt; 1;&lt;br /&gt;
&lt;br /&gt;
The company named 'New York Digital Health LLC' came up as the company that is causing the problems. I queried to find the two rows that list this company name in companybase and chose to keep the row that had the earlier updated date. It is a good practice to avoid deleting rows from tables when possible, so I added the filter as a WHERE clause to exclude one of the New York Digital listings.&lt;br /&gt;
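&lt;br /&gt;
The primary key check can be tried on toy data. The following is a minimal sketch, not the original workflow: it assumes Python's built-in sqlite3 module with an in-memory database as a stand-in for the PostgreSQL database, and the company rows are invented for illustration. It compares the total row count with the count of distinct keys and then lists any key that occurs more than once.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real PostgreSQL one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companybase "
    "(coname TEXT, statecode TEXT, datefirstinv TEXT, updateddate TEXT)"
)
# Invented rows: the last two share the candidate primary key.
conn.executemany(
    "INSERT INTO companybase VALUES (?, ?, ?, ?)",
    [
        ("Acme Robotics", "TX", "2010-01-05", "2012-03-01"),
        ("Acme Robotics", "CA", "2011-06-30", "2013-01-15"),
        ("Borealis Health", "NY", "2015-08-13", "2015-09-01"),
        ("Borealis Health", "NY", "2015-08-13", "2015-10-20"),
    ],
)

total_rows = conn.execute("SELECT COUNT(*) FROM companybase").fetchone()[0]
distinct_keys = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT DISTINCT coname, statecode, datefirstinv FROM companybase)"
).fetchone()[0]

# Keys that appear on more than one row are the ones to investigate.
duplicate_keys = conn.execute(
    "SELECT coname, statecode, datefirstinv, COUNT(*) FROM companybase "
    "GROUP BY coname, statecode, datefirstinv HAVING COUNT(*) > 1"
).fetchall()

print(total_rows, distinct_keys, duplicate_keys)
```
&lt;br /&gt;
When the two counts differ, the key is not valid yet, and the grouped query shows which rows to look at.&lt;br /&gt;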
&lt;br /&gt;
===Cleaning Fundbase===&lt;br /&gt;
The primary key for funds will be only the fundname. First get rid of all of the undisclosed funds. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundbasenound;&lt;br /&gt;
 CREATE TABLE fundbasenound AS &lt;br /&gt;
 SELECT DISTINCT * FROM fundbase WHERE fundname NOT LIKE '%Undisclosed Fund%';&lt;br /&gt;
 --28886&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM (SELECT DISTINCT fundname FROM fundbasenound)a;&lt;br /&gt;
 --28833&lt;br /&gt;
&lt;br /&gt;
As you can see, fundbasenound still has rows that share fundnames. If you are wondering why the DISTINCT in the first query did not eliminate these, it is because this DISTINCT applies to the whole row, not to individual fundnames. Thus, only completely duplicate rows will be eliminated in the first query. I chose to keep the funds that have the earlier last investment date.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundups;&lt;br /&gt;
 CREATE TABLE fundups AS SELECT&lt;br /&gt;
 fundname, max(lastinvdate) AS lastinvdate FROM fundbasenound GROUP BY fundname HAVING COUNT(*)&amp;gt;1;&lt;br /&gt;
 --53&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundbasecore;&lt;br /&gt;
 CREATE TABLE fundbasecore AS&lt;br /&gt;
 SELECT A.* FROM fundbasenound AS A LEFT JOIN fundups AS B ON A.fundname=B.fundname AND A.lastinvdate=B.lastinvdate WHERE B.fundname IS NULL AND B.lastinvdate IS NULL;&lt;br /&gt;
 --28833&lt;br /&gt;
&lt;br /&gt;
Since the count of fundbasecore is the same as the number of distinct fund names, we know that the fundbasecore table is clean. The first query (fundups) finds the duplicated fund names and, for each one, records the greater last investment date. The second query joins this table back to fundbasenound and keeps only the rows of fundbasenound that have no corresponding row in fundups on both fund name and date of last investment. For duplicated funds, this keeps the row with the earlier date of last investment.&lt;br /&gt;
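&lt;br /&gt;
The two-step deduplication can be illustrated on a few rows. This is a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database instead of PostgreSQL, keeps only two of the fundbase columns, and the fund names and dates are invented. Dates are stored as ISO text so that MAX compares them correctly in SQLite.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fundbasenound (fundname TEXT, lastinvdate TEXT)")
conn.executemany(
    "INSERT INTO fundbasenound VALUES (?, ?)",
    [
        ("Fund A", "2010-01-01"),
        ("Fund B", "2012-05-01"),  # duplicated name, earlier date: kept
        ("Fund B", "2014-07-01"),  # duplicated name, later date: dropped
        ("Fund C", "2009-03-03"),
    ],
)

# Step 1: for each duplicated fund name, record the latest last-investment date.
conn.execute(
    "CREATE TABLE fundups AS "
    "SELECT fundname, MAX(lastinvdate) AS lastinvdate "
    "FROM fundbasenound GROUP BY fundname HAVING COUNT(*) > 1"
)

# Step 2: anti-join, keeping only the rows that have no match in fundups.
conn.execute(
    "CREATE TABLE fundbasecore AS "
    "SELECT A.* FROM fundbasenound AS A "
    "LEFT JOIN fundups AS B "
    "ON A.fundname = B.fundname AND A.lastinvdate = B.lastinvdate "
    "WHERE B.fundname IS NULL"
)

core_rows = conn.execute(
    "SELECT fundname, lastinvdate FROM fundbasecore ORDER BY fundname"
).fetchall()
print(core_rows)
```
&lt;br /&gt;
Funds that were never duplicated pass through untouched, because they have no row in fundups at all.&lt;br /&gt;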
&lt;br /&gt;
===Cleaning Firmbase===&lt;br /&gt;
The primary key for firms will be firm name. First I got rid of all undisclosed firms. I also filtered out two firms that have identical firm names and founding dates. I did this because I use founding dates to filter out duplicate firm names: if two rows have the same firm name and founding date, they will not be filtered out by the third query below. Thus, I chose to filter those out completely.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmbasenound;&lt;br /&gt;
 CREATE TABLE firmbasenound AS &lt;br /&gt;
 SELECT DISTINCT * FROM firmbase WHERE firmname NOT LIKE '%Undisclosed Firm%' AND firmname NOT LIKE '%Amundi%' AND firmname NOT LIKE '%Schroder Adveq Management%';&lt;br /&gt;
 --15452&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT firmname FROM firmbasenound)a;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
Since these counts are not equal we will have to clean the table further. We will use the same method from before.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmdups;&lt;br /&gt;
 CREATE TABLE firmdups AS SELECT&lt;br /&gt;
 firmname, max(foundingdate) as foundingdate FROM firmbasenound GROUP BY firmname HAVING COUNT(*)&amp;gt;1;&lt;br /&gt;
 --15&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmbasecore;&lt;br /&gt;
 CREATE TABLE firmbasecore AS&lt;br /&gt;
 SELECT A.* FROM firmbasenound AS A LEFT JOIN firmdups AS B ON A.firmname=B.firmname AND A.foundingdate=B.foundingdate WHERE B.firmname IS NULL AND B.foundingdate IS NULL;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
Since the count of firmbasecore and the DISTINCT query are the same, the firm table is now clean.&lt;br /&gt;
&lt;br /&gt;
===Cleaning Branch Offices===&lt;br /&gt;
When cleaning the branch offices, I had to remove all duplicates in the table. This is because the table is so sparse that often the only data in a row is the name of the firm the branch is associated with. Thus, I couldn't filter based on dates as I did for firms and funds. The primary key is firm name.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bonound;&lt;br /&gt;
 CREATE TABLE bonound AS&lt;br /&gt;
 SELECT *, CASE WHEN firmname LIKE '%Undisclosed Firm%' THEN 1::int ELSE 0::int END AS undisclosedflag&lt;br /&gt;
 FROM branchoffices;&lt;br /&gt;
 --10353&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT firmname FROM bonound)a;&lt;br /&gt;
 --10042&lt;br /&gt;
&lt;br /&gt;
Since these counts aren't the same, we will have to work a little more to clean the table. As stated above, I did this by excluding the firm names that were duplicated.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE branchofficecore;&lt;br /&gt;
 CREATE TABLE branchofficecore AS&lt;br /&gt;
 SELECT A.* FROM bonound AS A JOIN (&lt;br /&gt;
 		SELECT bonound.firmname, COUNT(*) FROM bonound GROUP BY firmname&lt;br /&gt;
 		HAVING COUNT(*) =1&lt;br /&gt;
 		) AS B&lt;br /&gt;
 ON A.firmname=B.firmname WHERE undisclosedflag=0;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM (SELECT DISTINCT firmname FROM branchofficecore)a;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
Since these counts are the same, we are good to go. The count is 10 lower because we completely removed 10 firmnames from the listing by throwing out the duplicates.&lt;br /&gt;
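&lt;br /&gt;
The approach of throwing out every duplicated firm name can be seen on a small example. This is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL. The table is reduced to two columns and the firm names are invented.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branchoffices (firmname TEXT, bocity TEXT)")
conn.executemany(
    "INSERT INTO branchoffices VALUES (?, ?)",
    [
        ("Alpha Ventures", "Houston"),
        ("Beta Capital", "Austin"),
        ("Beta Capital", None),      # same firm name with almost no data
        ("Undisclosed Firm", None),
    ],
)

# Flag undisclosed firms instead of deleting them.
conn.execute(
    "CREATE TABLE bonound AS "
    "SELECT *, CASE WHEN firmname LIKE '%Undisclosed Firm%' THEN 1 ELSE 0 END "
    "AS undisclosedflag FROM branchoffices"
)

# Keep only firm names that occur exactly once and are not undisclosed.
conn.execute(
    "CREATE TABLE branchofficecore AS "
    "SELECT A.* FROM bonound AS A JOIN ("
    "  SELECT firmname FROM bonound GROUP BY firmname HAVING COUNT(*) = 1"
    ") AS B ON A.firmname = B.firmname WHERE A.undisclosedflag = 0"
)

kept_firms = [
    row[0]
    for row in conn.execute("SELECT firmname FROM branchofficecore ORDER BY firmname")
]
row_count = conn.execute("SELECT COUNT(*) FROM branchofficecore").fetchone()[0]
distinct_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT firmname FROM branchofficecore)"
).fetchone()[0]
print(kept_firms, row_count, distinct_count)
```
&lt;br /&gt;
Both rows for the duplicated firm are dropped, which is why the final count falls below the number of distinct names in the input.&lt;br /&gt;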
&lt;br /&gt;
==Instructions on Matching PortCos to Issuers and M&amp;amp;As From Ed==&lt;br /&gt;
===Company Standardizing===&lt;br /&gt;
&lt;br /&gt;
Get portco keys&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE portcokey;&lt;br /&gt;
 CREATE TABLE portcokey AS&lt;br /&gt;
 SELECT coname, statecode, datefirst&lt;br /&gt;
 FROM portcocore;&lt;br /&gt;
 --CHECK COUNT IS SAME AS portcocore. IF NOT, THESE KEYS ARE NOT VALID: FIX THAT FIRST&lt;br /&gt;
&lt;br /&gt;
Get distinct coname and put it in a file&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT DISTINCT coname FROM portcokey) TO 'DistinctConame.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
&lt;br /&gt;
Match that to itself&lt;br /&gt;
 Move DistinctConame.txt to E:\McNair\Software\Scripts\Matcher\Input&lt;br /&gt;
 Open powershell and change directory to E:\McNair\Software\Scripts\Matcher&lt;br /&gt;
 Run the matcher in mode2:&lt;br /&gt;
  perl Matcher.pl -file1=&amp;quot;DistinctConame.txt&amp;quot; -file2=&amp;quot;DistinctConame.txt&amp;quot; -mode=2&lt;br /&gt;
 Pick up the output file from E:\McNair\Software\Scripts\Matcher\Output (it is probably called DistinctConame.txt-DistinctConame.txt.matched) and move it to your Z drive directory&lt;br /&gt;
 &lt;br /&gt;
Load the matches into the dbase&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE PortcoStd;&lt;br /&gt;
 CREATE TABLE PortcoStd (&lt;br /&gt;
    conamestd  varchar(255),&lt;br /&gt;
    coname   varchar(255),&lt;br /&gt;
    norm  varchar(100),&lt;br /&gt;
    x1  varchar(255),&lt;br /&gt;
    x2  varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
 &lt;br /&gt;
 \COPY PortcoStd FROM 'DistinctConame.txt-DistinctConame.txt.matched' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --YOUR COUNT&lt;br /&gt;
 &lt;br /&gt;
Join the conamestd back to the portcokey table to create your matching table&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE portcokeysstd;&lt;br /&gt;
 CREATE TABLE portcokeysstd AS&lt;br /&gt;
 SELECT B.conamestd, A.*&lt;br /&gt;
 FROM portcokey AS A&lt;br /&gt;
 JOIN PortcoStd AS B ON A.coname=B.coname;&lt;br /&gt;
 --CHECK COUNT IS SAME AS portcokey OR YOU LOST SOME NAMES OR INFLATED THE DATA&lt;br /&gt;
 &lt;br /&gt;
Put that in a file for matching (conamestd is in first column by construction)&lt;br /&gt;
&lt;br /&gt;
  \COPY portcokeysstd TO 'PortCoMatchInput.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
  --YOUR COUNT&lt;br /&gt;
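&lt;br /&gt;
The count checks in the comments above can also be done with queries that show which names were lost or inflated by the join. This is a minimal sketch rather than Ed's procedure: it assumes Python's built-in sqlite3 module with an in-memory database in place of PostgreSQL, uses a reduced two-column version of the standardized-name table, and all rows are invented.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE portcokey (coname TEXT, statecode TEXT, datefirst TEXT)")
conn.execute("CREATE TABLE portcostd (conamestd TEXT, coname TEXT)")
conn.executemany(
    "INSERT INTO portcokey VALUES (?, ?, ?)",
    [
        ("Acme Inc", "TX", "2010-01-05"),
        ("Borealis LLC", "NY", "2015-08-13"),
        ("Cobalt Corp", "CA", "2012-02-02"),  # has no standardized name below
    ],
)
conn.executemany(
    "INSERT INTO portcostd VALUES (?, ?)",
    [
        ("ACME", "Acme Inc"),
        ("BOREALIS", "Borealis LLC"),
        ("BOREALIS HEALTH", "Borealis LLC"),  # extra mappings inflate the join
        ("BOREALIS NY", "Borealis LLC"),
    ],
)

key_count = conn.execute("SELECT COUNT(*) FROM portcokey").fetchone()[0]
joined_count = conn.execute(
    "SELECT COUNT(*) FROM portcokey AS A JOIN portcostd AS B ON A.coname = B.coname"
).fetchone()[0]

# Names that would be lost: keys with no standardized name.
lost = [
    r[0]
    for r in conn.execute(
        "SELECT A.coname FROM portcokey AS A LEFT JOIN portcostd AS B "
        "ON A.coname = B.coname WHERE B.coname IS NULL ORDER BY A.coname"
    )
]

# Names that would inflate the data: more than one standardized name each.
inflating = [
    r[0]
    for r in conn.execute(
        "SELECT coname FROM portcostd GROUP BY coname "
        "HAVING COUNT(*) > 1 ORDER BY coname"
    )
]
print(key_count, joined_count, lost, inflating)
```
&lt;br /&gt;
If the joined count differs from the key count, the two lists show which names to fix before writing the match input file.&lt;br /&gt;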
&lt;br /&gt;
&lt;br /&gt;
===MA Cleaning and Matching===&lt;br /&gt;
First remove all of the duplicates in the MA data. Do this by running aggregate queries on every column except for the primary key:&lt;br /&gt;
 DROP TABLE MANoDups;&lt;br /&gt;
 CREATE TABLE MANoDups AS&lt;br /&gt;
 SELECT targetname, targetstate, announceddate, min(effectivedate) AS effectivedate, MIN(acquirorname) as acquirorname, MIN(acquirorstate) as acquirorstate, MAX(transactionamt) as &lt;br /&gt;
 transactionamt, MAX(enterpriseval) as enterpriseval, MIN(acquirorstatus) as acquirorstatus&lt;br /&gt;
 FROM mas &lt;br /&gt;
 GROUP BY targetname, targetstate, announceddate ORDER BY targetname, targetstate, announceddate;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT targetname, targetstate, announceddate FROM manodups)a;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
Since these counts are equivalent, the data set is clean. Then get all the primary keys from the table and copy the distinct target names into a text file.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE makey;&lt;br /&gt;
 CREATE TABLE makey AS&lt;br /&gt;
 SELECT targetname, targetstate, announceddate&lt;br /&gt;
 FROM manodups;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT DISTINCT targetname FROM makey) TO 'DistinctTargetName.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV;&lt;br /&gt;
 --117212&lt;br /&gt;
&lt;br /&gt;
After running this list of distinct target names through the matcher, put the standardized MA list into the data base.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MaStd;&lt;br /&gt;
 CREATE TABLE MaStd (&lt;br /&gt;
   targetnamestd varchar(255),&lt;br /&gt;
   targetname varchar(255),&lt;br /&gt;
   norm varchar(100),&lt;br /&gt;
   x1 varchar(255),&lt;br /&gt;
   x2 varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY mastd FROM 'DistinctTargetName.txt-DistinctTargetName.txt.matched' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --117212&lt;br /&gt;
&lt;br /&gt;
Then match the list of standardized names back to the makey table to get a table with standardized keys and primary keys. This will be your input for matching against port cos. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE makeysstd;&lt;br /&gt;
 CREATE TABLE makeysstd AS&lt;br /&gt;
 SELECT B.targetnamestd, A.*&lt;br /&gt;
 FROM makey AS A&lt;br /&gt;
 JOIN mastd AS B ON A.targetname=B.targetname;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
  \COPY makeysstd TO 'MAMatchInput.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
  --119374&lt;br /&gt;
&lt;br /&gt;
Use this text file to match against the PortCoMatchInput. Your job will be to determine whether the matches between the MAs and PortCos are true matches. The techniques that I used are described in the section below.&lt;br /&gt;
&lt;br /&gt;
===IPO Cleaning and Matching===&lt;br /&gt;
The process is the same for IPOs.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE iponodups;&lt;br /&gt;
 CREATE TABLE iponodups&lt;br /&gt;
 AS SELECT issuer, statecode, issuedate, MAX(principalamt) AS principalamt, MAX(proceedsamt) AS proceedsamt, MIN(naiccode) as naicode, MIN(zipcode) AS zipcode, MIN(status) AS status, &lt;br /&gt;
 MIN(foundeddate) AS foundeddate&lt;br /&gt;
 FROM ipos GROUP BY issuer, statecode, issuedate ORDER BY issuer, statecode, issuedate; &lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT issuer, statecode, issuedate FROM iponodups)a;&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ipokeys;&lt;br /&gt;
 CREATE TABLE ipokeys AS&lt;br /&gt;
 SELECT issuer, statecode, issuedate&lt;br /&gt;
 FROM iponodups;&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT DISTINCT issuer FROM ipokeys) TO 'IPODistinctIssuer.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --10803&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ipokeysstd;&lt;br /&gt;
 CREATE TABLE ipokeysstd (&lt;br /&gt;
    issuerstd varchar(255),&lt;br /&gt;
    issuer varchar(255),&lt;br /&gt;
    norm varchar(100),&lt;br /&gt;
    x1 varchar(255),&lt;br /&gt;
    x2 varchar(255)&lt;br /&gt;
   );&lt;br /&gt;
 &lt;br /&gt;
 \COPY ipokeysstd FROM 'IPODistinctIssuer.txt-IPODistinctIssuer.txt.matched' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --10803&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ipostd;&lt;br /&gt;
 CREATE TABLE ipostd AS&lt;br /&gt;
 SELECT B.issuerstd, A.*&lt;br /&gt;
 FROM ipokeys AS A&lt;br /&gt;
 JOIN ipokeysstd AS B ON A.issuer=B.issuer;&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 \COPY ipostd TO 'IPOMatchInput.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
As with MA, match this file against the PortCoMatchInput file without mode 2. Then manually check the matches using the techniques described below.&lt;br /&gt;
&lt;br /&gt;
I generally use MAX for amounts and MIN for dates. I also chose to use MIN on text strings.&lt;br /&gt;
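&lt;br /&gt;
The effect of the aggregate functions can be checked on toy data. This is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL. Only two non-key columns are kept, and the rows are invented. It groups by the primary key and takes MIN of the date and MAX of the amount, so each key ends up on exactly one row.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mas (targetname TEXT, targetstate TEXT, announceddate TEXT, "
    "effectivedate TEXT, transactionamt REAL)"
)
conn.executemany(
    "INSERT INTO mas VALUES (?, ?, ?, ?, ?)",
    [
        # Two rows share a primary key but disagree on the other columns.
        ("Acme Inc", "TX", "2015-01-10", "2015-03-01", 50.0),
        ("Acme Inc", "TX", "2015-01-10", "2015-02-15", None),
        ("Zenith Corp", "CA", "2016-05-05", None, 12.5),
    ],
)

# One row per key: earliest date, largest amount. Aggregates skip NULLs.
nodups = conn.execute(
    "SELECT targetname, targetstate, announceddate, "
    "MIN(effectivedate) AS effectivedate, MAX(transactionamt) AS transactionamt "
    "FROM mas GROUP BY targetname, targetstate, announceddate "
    "ORDER BY targetname, targetstate, announceddate"
).fetchall()
print(nodups)
```
&lt;br /&gt;
Note that the surviving row can combine values from different source rows, as it does for the first key here.&lt;br /&gt;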
&lt;br /&gt;
==Cleaning IPO and MA Data==&lt;br /&gt;
It is important to follow Ed's direction to clean the data using aggregate functions before putting it into Excel. This will save you a lot of unnecessary manual checking. When ready, paste the data into an Excel file. In that file, I made three columns: one checking whether the state codes were equivalent, one checking whether the date of first investment was 3 years before the MA or IPO, and one checking whether both of these conditions were satisfied for each company. I did this using simple IF statements. The rest of the process is manual checking and filtering to decide whether matches are correct, so it is extremely subjective and tedious. First, I went through the companies that did not have equivalent state codes. If the company was one that I knew, or the name was unique enough that I did not believe it would appear in another state, I marked the state codes as equivalent. I did the same for the date of first investment versus the MA/IPO date. Then I removed all duplicates that had the marking Warning Multiple Matches, and the data sheets were clean.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Process For Creating the PortCoExits Table==&lt;br /&gt;
===MA Process===&lt;br /&gt;
First we must load the clean, manually checked tables back into the database. &lt;br /&gt;
 DROP TABLE MAClean;&lt;br /&gt;
 CREATE TABLE MAClean (&lt;br /&gt;
  conamestd varchar(255),&lt;br /&gt;
  targetnamestd varchar(255),&lt;br /&gt;
  method varchar(100),&lt;br /&gt;
  x1 varchar(255),&lt;br /&gt;
  coname varchar(255),&lt;br /&gt;
  statecode varchar(10),&lt;br /&gt;
  datefirstinv date,&lt;br /&gt;
  x2 varchar(255),&lt;br /&gt;
  targetname varchar(255),&lt;br /&gt;
  targetstate varchar(10),&lt;br /&gt;
  announceddate date&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY MAClean FROM 'MAClean.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --7205&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT targetname, targetstate, announceddate FROM MAClean)a;&lt;br /&gt;
 --7188&lt;br /&gt;
&lt;br /&gt;
As you can see there are still duplicate primary keys in the table. To get rid of these I wrote a query that chooses primary keys that occur only once and matches them against MANoDups. That way you will have unique primary keys by construction.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MACleanNoDups;&lt;br /&gt;
 CREATE TABLE MACleanNoDups AS&lt;br /&gt;
 SELECT A.*, effectivedate, transactionamt, enterpriseval, acquirorstatus&lt;br /&gt;
 FROM MAClean AS A&lt;br /&gt;
 JOIN (&lt;br /&gt;
 	SELECT targetname, targetstate, announceddate, COUNT(*) FROM MAClean&lt;br /&gt;
 	GROUP BY targetname, targetstate, announceddate HAVING COUNT(*)=1&lt;br /&gt;
 	) AS B&lt;br /&gt;
 ON A.targetname=B.targetname AND A.targetstate=B.targetstate AND A.announceddate=B.announceddate&lt;br /&gt;
 LEFT JOIN MANoDups AS C ON A.targetname=C.targetname AND A.targetstate=C.targetstate AND A.announceddate=C.announceddate;&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT coname, statecode, datefirstinv FROM MACleanNoDups)a;&lt;br /&gt;
 --7171&lt;br /&gt;
&lt;br /&gt;
Thus the portco primary key is unique in the table. We will use this later. &lt;br /&gt;
Now do the same for the IPOs.&lt;br /&gt;
&lt;br /&gt;
===IPO Process===&lt;br /&gt;
 DROP TABLE IPOClean;&lt;br /&gt;
 CREATE TABLE IPOClean (&lt;br /&gt;
  conamestd varchar(255),&lt;br /&gt;
  issuernamestd varchar(255),&lt;br /&gt;
  method varchar(100),&lt;br /&gt;
  x1 varchar(255),&lt;br /&gt;
  coname varchar(255),&lt;br /&gt;
  statecode varchar(10),&lt;br /&gt;
  datefirstinv date,&lt;br /&gt;
  x2 varchar(255),&lt;br /&gt;
  issuername varchar(255),&lt;br /&gt;
  issuerstate varchar(10),&lt;br /&gt;
  issuedate date&lt;br /&gt;
 );&lt;br /&gt;
 &lt;br /&gt;
 \COPY IPOClean FROM 'IPOClean.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --2146&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT issuername, issuerstate, issuedate FROM IPOClean)a;&lt;br /&gt;
 --2141&lt;br /&gt;
&lt;br /&gt;
As with the MA process, there were duplicates in the clean IPO table. Get rid of these using the same process as with MAs. Only choose the primary keys that occur once and join these to the IPONoDups table.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE IPOCleanNoDups;&lt;br /&gt;
 CREATE TABLE IPOCleanNoDups AS&lt;br /&gt;
 SELECT A.*, principalamt, proceedsamt, naicode as naics, zipcode, status, foundeddate&lt;br /&gt;
 FROM IPOClean AS A&lt;br /&gt;
 JOIN (&lt;br /&gt;
 	SELECT issuername, issuerstate, issuedate, COUNT(*) FROM IPOClean&lt;br /&gt;
 	GROUP BY issuername, issuerstate, issuedate HAVING COUNT(*)=1&lt;br /&gt;
 	) AS B&lt;br /&gt;
 ON A.issuername=B.issuername AND A.issuerstate=B.issuerstate AND A.issuedate=B.issuedate&lt;br /&gt;
 LEFT JOIN IPONoDups AS C ON A.issuername=C.issuer AND A.issuerstate=C.statecode AND A.issuedate=C.issuedate;&lt;br /&gt;
 --2136&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT coname, statecode, datefirstinv FROM IPOCleanNoDups)a;&lt;br /&gt;
 --2136&lt;br /&gt;
&lt;br /&gt;
Now the duplicates are out of the MAClean and IPOClean data and we can start to construct the ExitKeysClean table.&lt;br /&gt;
&lt;br /&gt;
==Creating ExitKeysClean==&lt;br /&gt;
&lt;br /&gt;
First I looked for the PortCos that were in both the MAs and the IPOs. I did this using:&lt;br /&gt;
 DROP TABLE IPOMAForReview;&lt;br /&gt;
 CREATE TABLE IPOMAForReview AS&lt;br /&gt;
 SELECT A.*, B.targetname, B.targetstate, B.announceddate&lt;br /&gt;
 FROM IPOCleanNoDups AS A&lt;br /&gt;
 JOIN MACleanNoDups AS B ON A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv;&lt;br /&gt;
 --92&lt;br /&gt;
&lt;br /&gt;
I then pulled out the IPOs that were only IPOs and the MAs that were only MAs. I also added a column, IPOvsMA, that indicates whether a company underwent an IPO (1) or an MA (0).&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE IPONoConflict;&lt;br /&gt;
 CREATE TABLE IPONoConflict AS&lt;br /&gt;
 SELECT A.*, 1::int as IPOvsMA&lt;br /&gt;
 FROM IPOCleanNoDups AS A &lt;br /&gt;
 LEFT JOIN MACleanNoDups AS B &lt;br /&gt;
 ON A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv &lt;br /&gt;
 WHERE B.statecode IS NULL AND B.coname IS NULL AND B.datefirstinv IS NULL;&lt;br /&gt;
 --2044&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MANoConflict;&lt;br /&gt;
 CREATE TABLE MANoConflict AS&lt;br /&gt;
 SELECT A.*, 0::int as IPOvsMA&lt;br /&gt;
 FROM MACleanNoDups AS A&lt;br /&gt;
 LEFT JOIN IPOCleanNoDups AS B &lt;br /&gt;
 ON A.coname=B.Coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv&lt;br /&gt;
 WHERE B.statecode IS NULL AND B.coname IS NULL AND B.datefirstinv IS NULL;&lt;br /&gt;
 --7079&lt;br /&gt;
&lt;br /&gt;
Since 2136-92=2044 and 7171-92=7079, we know that the 92 companies appearing in both tables were extracted successfully.&lt;br /&gt;
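&lt;br /&gt;
The overlap and no-conflict queries can be checked on toy data. This is a sketch only: it uses Python's sqlite3 module in place of PostgreSQL, and the tables ipo and ma with their rows are invented. It shows the LEFT JOIN ... IS NULL anti-join and the same reconciliation as the arithmetic above, which holds only when the key is unique in both tables.&lt;br /&gt;

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; tables and rows are invented.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ipo (coname TEXT, statecode TEXT, datefirstinv TEXT)")
con.execute("CREATE TABLE ma (coname TEXT, statecode TEXT, datefirstinv TEXT)")
con.executemany(
    "INSERT INTO ipo VALUES (?, ?, ?)",
    [("Acme", "TX", "1999-01-01"), ("Bolt", "CA", "2000-06-15")],
)
con.executemany(
    "INSERT INTO ma VALUES (?, ?, ?)",
    [("Acme", "TX", "1999-01-01"), ("Cask", "NY", "2002-03-03")],
)

join_on = (
    "A.coname = B.coname AND A.statecode = B.statecode "
    "AND A.datefirstinv = B.datefirstinv"
)

# Companies in both tables (inner join) and in only one (anti-join).
both = con.execute(
    f"SELECT A.coname FROM ipo AS A JOIN ma AS B ON {join_on}"
).fetchall()
ipo_only = con.execute(
    f"SELECT A.coname FROM ipo AS A LEFT JOIN ma AS B ON {join_on} "
    "WHERE B.coname IS NULL"
).fetchall()
ma_only = con.execute(
    f"SELECT A.coname FROM ma AS A LEFT JOIN ipo AS B ON {join_on} "
    "WHERE B.coname IS NULL"
).fetchall()

# Total minus overlap equals the no-conflict count, as with 2136-92=2044.
ipo_total = con.execute("SELECT COUNT(*) FROM ipo").fetchone()[0]
assert ipo_total - len(both) == len(ipo_only)
print(both, ipo_only, ma_only)
```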
&lt;br /&gt;
I then wrote a query to check whether the IPO issue date or the MA announced date was earlier, and recorded the earlier event as the company's exit in the column IPOvsMA (ties go to the MA). A 0 in the column means the MA was chosen and a 1 means the IPO was chosen.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I then extracted the MAs and IPOs from IPOMAForReview, setting the IPOvsMA flag for each:&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MASelected;&lt;br /&gt;
 CREATE TABLE MASelected AS&lt;br /&gt;
 SELECT coname, statecode, datefirstinv, &lt;br /&gt;
 targetname, targetstate, announceddate,&lt;br /&gt;
 0::int as IPOvsMA&lt;br /&gt;
 FROM IPOMAForReview &lt;br /&gt;
 WHERE issuedate &amp;gt;= announceddate;&lt;br /&gt;
 --25&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE IPOSelected;&lt;br /&gt;
 CREATE TABLE IPOSelected AS&lt;br /&gt;
 SELECT coname, statecode, datefirstinv, &lt;br /&gt;
 issuername, issuerstate, issuedate,&lt;br /&gt;
 1::int as IPOvsMA&lt;br /&gt;
 FROM IPOMAForReview &lt;br /&gt;
 WHERE issuedate &amp;lt; announceddate;&lt;br /&gt;
 --67&lt;br /&gt;
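&lt;br /&gt;
The selection rule in the two queries above can be written as a small function. This is an illustrative sketch; the function name choose_exit is hypothetical and the dates are invented.&lt;br /&gt;

```python
from datetime import date


def choose_exit(issuedate, announceddate):
    """Return 1 (IPO) if the issue date is strictly earlier, else 0 (MA)."""
    # Mirrors the WHERE clauses: issuedate >= announceddate selects the MA.
    return 0 if issuedate >= announceddate else 1


ipo_first = choose_exit(date(2005, 1, 1), date(2006, 1, 1))
ma_first = choose_exit(date(2006, 1, 1), date(2005, 1, 1))
same_day = choose_exit(date(2005, 1, 1), date(2005, 1, 1))
print(ipo_first, ma_first, same_day)
```

Because the two conditions are complementary, every row of IPOMAForReview with both dates present lands in exactly one table, which is consistent with 25+67=92.&lt;br /&gt;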
&lt;br /&gt;
I then made the ExitKeysClean table (named ExitKeys in the query below) from the portco primary key and the IPOvsMA indicator column.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ExitKeys;&lt;br /&gt;
 CREATE TABLE ExitKeys AS&lt;br /&gt;
 SELECT coname, statecode, datefirstinv, ipovsma FROM IPONoConflict&lt;br /&gt;
 UNION SELECT coname, statecode, datefirstinv, ipovsma FROM IPOSelected&lt;br /&gt;
 UNION SELECT coname, statecode, datefirstinv, ipovsma FROM MANoConflict&lt;br /&gt;
 UNION SELECT coname, statecode, datefirstinv, ipovsma FROM MASelected;&lt;br /&gt;
 --9215&lt;br /&gt;
&lt;br /&gt;
==Create the PortCoExit And PortCoAliveDead Tables==&lt;br /&gt;
After consulting with Ed and the VC Database Rebuild wiki, I decided to build the PortCoExit table with an ipovsma flag, an exit indicator, an exit value (exitvaluem), an exit date, and an exit year. The ipovsma column determines which source supplies the exit data, so it is very important that you have constructed it.&lt;br /&gt;
 DROP TABLE PortCoExit;&lt;br /&gt;
 CREATE TABLE PortCoExit AS&lt;br /&gt;
 SELECT A.coname, A.statecode, A.datefirstinv, A.datelastinv, A.city, B.ipovsma,&lt;br /&gt;
 CASE WHEN B.ipovsma IS NOT NULL THEN 1::int ELSE 0::int END AS Exit,&lt;br /&gt;
 CASE WHEN B.ipovsma=1 THEN C.proceedsamt::numeric WHEN ipovsma=0 THEN D.transactionamt::numeric ELSE NULL::numeric END AS exitvaluem,&lt;br /&gt;
 CASE WHEN B.ipovsma=1 THEN C.issuedate WHEN ipovsma=0 THEN D.announceddate ELSE NULL::date END AS exitdate,&lt;br /&gt;
 CASE WHEN B.ipovsma=1 THEN extract(year from C.issuedate) WHEN ipovsma=0 THEN extract(year from D.announceddate) ELSE NULL::int END AS exityear&lt;br /&gt;
 FROM companybasecore AS A&lt;br /&gt;
 LEFT JOIN ExitKeys AS B ON A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv&lt;br /&gt;
 LEFT JOIN IPOCleanNoDups AS C ON A.coname=C.coname AND A.statecode=C.statecode AND A.datefirstinv=C.datefirstinv&lt;br /&gt;
 LEFT JOIN MACleanNoDups AS D ON A.coname=D.coname AND A.statecode=D.statecode AND A.datefirstinv=D.datefirstinv;&lt;br /&gt;
 --48001&lt;br /&gt;
&lt;br /&gt;
I then used this table to build one that records whether a company was dead or alive. A company that underwent an IPO or MA is marked dead as of its exit date. Otherwise, if its date of last investment plus five years falls before 7/1/2018, it is marked dead on that date.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE PortCoAliveDead;&lt;br /&gt;
 CREATE TABLE PortCoAliveDead AS&lt;br /&gt;
 SELECT *, &lt;br /&gt;
 datefirstinv as alivedate, extract(year from datefirstinv) as aliveyear,&lt;br /&gt;
 CASE WHEN exitdate IS NOT NULL then exitdate &lt;br /&gt;
 	WHEN exitdate IS NULL AND (datelastinv + INTERVAL '5 year') &amp;lt; '7/1/2018' THEN (datelastinv + INTERVAL '5 year') &lt;br /&gt;
 	ELSE NULL::date END AS deaddate,&lt;br /&gt;
 CASE WHEN exitdate IS NOT NULL then exityear &lt;br /&gt;
 	WHEN exitdate IS NULL AND (datelastinv + INTERVAL '5 year') &amp;lt; '7/1/2018' THEN extract(year from (datelastinv + INTERVAL '5 year')) &lt;br /&gt;
 	ELSE NULL::int END AS deadyear&lt;br /&gt;
 FROM PortCoExit;&lt;br /&gt;
 --48001&lt;br /&gt;
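&lt;br /&gt;
The dead-date rule can be sketched in plain Python to make the branches explicit. The helper names add_years and dead_date are hypothetical, and clamping Feb 29 to Feb 28 is an assumption meant to match how PostgreSQL adds a year interval.&lt;br /&gt;

```python
from datetime import date

CUTOFF = date(2018, 7, 1)  # the '7/1/2018' cutoff used in the query


def add_years(d, years):
    """Add whole years, clamping Feb 29 to Feb 28 in non-leap target years."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def dead_date(exitdate, datelastinv):
    """Exit date if exited; else last investment + 5 years if before the cutoff."""
    if exitdate is not None:
        return exitdate
    if datelastinv is None:
        return None  # a NULL comparison falls through to ELSE NULL in SQL
    candidate = add_years(datelastinv, 5)
    return candidate if CUTOFF > candidate else None


print(dead_date(date(2010, 3, 1), date(2008, 1, 1)))
print(dead_date(None, date(2012, 6, 30)))
print(dead_date(None, date(2015, 1, 1)))
```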
&lt;br /&gt;
==GeoCoding Companies, Firms, and Branch Offices==&lt;br /&gt;
A helpful page here is the [[Geocode.py]] page which explains how to use the Geocoding script. You will have to tweak the Geocode script when geocoding as each of these tables has a different primary key. It is vital that you include the primary keys in the file you input and output from the Geocoding script. Without these, you will not be able to join the latitudes and longitudes back to the firm, branch office, or company base tables.&lt;br /&gt;
&lt;br /&gt;
Geocoding costs money since we are using the Google Maps API. The process doesn't cost much, but in order to save money I tried to salvage as much of the preexisting geocode information as I could find.&lt;br /&gt;
===Companies===&lt;br /&gt;
I found the table of old companies with latitudes and longitudes in vcdb2 and loaded these into vcdb3.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE oldgeocords;&lt;br /&gt;
 CREATE TABLE oldgeocords (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   ivestedk real,&lt;br /&gt;
   city varchar(255),&lt;br /&gt;
   addr1 varchar(255),&lt;br /&gt;
   addr2 varchar(100),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY oldgeocords FROM 'companybasegeomaster.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --44740&lt;br /&gt;
&lt;br /&gt;
The API occasionally will give erroneous latitude and longitude readings. In order to catch only the good ones, I found the latitude and longitude lines that encompass the mainland US and created an exclude flag to make sure companies were in this box. I then created flags to include companies in Puerto Rico, Hawaii, and Alaska. Companies that were in these places often had wrong latitude and longitude readings of 44.93, 7.54, so I ran a query making sure that these weren't listed. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geoallcoords;&lt;br /&gt;
 CREATE TABLE geoallcoords AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM oldgeocords;&lt;br /&gt;
 --44740&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geoallcoords1;&lt;br /&gt;
 CREATE TABLE geoallcoords1 AS SELECT&lt;br /&gt;
 *, CASE WHEN statecode='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN statecode='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN statecode='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM geoallcoords;&lt;br /&gt;
 --44740&lt;br /&gt;
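&lt;br /&gt;
The flag logic above can be sketched as two small Python functions. The names exclude_flag and keep are hypothetical, and the sample coordinates are rough illustrative values rather than rows from the data.&lt;br /&gt;

```python
BAD_POINT = (44.9331, 7.54012)  # the erroneous reading checked in the queries


def exclude_flag(latitude, longitude):
    """1 if the point is missing or outside the mainland US box, else 0."""
    if latitude is None or longitude is None:
        return 1
    outside = (
        -125 > longitude or longitude > -66
        or 24 > latitude or latitude > 50
    )
    return 1 if outside else 0


def keep(statecode, latitude, longitude):
    """Keep mainland points, plus PR, HI, and AK points that pass the bad-reading check."""
    if exclude_flag(latitude, longitude) == 0:
        return True
    if statecode not in ("PR", "HI", "AK"):
        return False
    if latitude is None or longitude is None:
        return False
    # Same test as the prflag, hiflag, and akflag CASE expressions.
    return latitude != BAD_POINT[0] and longitude != BAD_POINT[1]


print(keep("TX", 29.76, -95.37))
print(keep("HI", 21.3, -157.86))
print(keep("HI", 44.9331, 7.54012))
```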
&lt;br /&gt;
I then included only companies that were either in the mainland US, Hawaii, Alaska, or Puerto Rico. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodgeoold;&lt;br /&gt;
 CREATE TABLE goodgeoold AS SELECT&lt;br /&gt;
 A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM companybasecore AS A LEFT JOIN geoallcoords1 AS B ON&lt;br /&gt;
 A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --38498&lt;br /&gt;
&lt;br /&gt;
I then found the remaining companies that needed to be geocoded. Only companies that have addresses listed are able to be accurately geocoded. If we attempt to geocode based on city, the location returned will simply be the center of the city. Thus, I chose the companies that we did not already have listings for and had a valid address.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE remaininggeo;&lt;br /&gt;
 CREATE TABLE remaininggeo AS SELECT A.coname, A.statecode, A.datefirstinv, A.addr1, A.addr2, A.city, A.zip FROM companybasecore AS A LEFT JOIN goodgeoold AS B ON A.coname=B.coname &lt;br /&gt;
 AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv&lt;br /&gt;
 WHERE B.coname IS NULL AND A.addr1 IS NOT NULL;&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
 \COPY remaininggeo TO 'RemainingGeo.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
I copied this table into Excel to concatenate the address, city, state, and zipcode columns into one column. This can and should be done in SQL, but I was not aware of that at the time. I then ran remaininggeo through the Geocode script, with the columns coname, statecode, datefirstinv, and address in the input file.&lt;br /&gt;
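&lt;br /&gt;
A minimal sketch of doing the concatenation in SQL instead of Excel follows. It runs the query through Python's sqlite3 module as a stand-in for PostgreSQL; the toy table (which omits datefirstinv), its rows, and the exact address layout are assumptions, so adjust the format to whatever the Geocode script expects.&lt;br /&gt;

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the rows are invented.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE remaininggeo "
    "(coname TEXT, addr1 TEXT, addr2 TEXT, city TEXT, statecode TEXT, zip TEXT)"
)
con.executemany(
    "INSERT INTO remaininggeo VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Acme", "1 Main St", None, "Houston", "TX", "77005"),
        ("Bolt", "2 Oak Ave", "Suite 5", "Austin", "TX", None),
    ],
)

# The || operator joins strings; COALESCE drops the optional parts when NULL.
rows = con.execute(
    """
    SELECT coname,
           addr1
           || COALESCE(' ' || addr2, '')
           || ', ' || city
           || ', ' || statecode
           || COALESCE(' ' || zip, '') AS address
    FROM remaininggeo
    ORDER BY coname
    """
).fetchall()
for row in rows:
    print(row)
```

The same expression should work unchanged in PostgreSQL, which also offers concat_ws for joining values while skipping NULLs. The sketch assumes city and statecode are never NULL; if either is, the whole address becomes NULL.&lt;br /&gt;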
&lt;br /&gt;
 DROP TABLE remaining;&lt;br /&gt;
 CREATE TABLE remaining (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY remaining FROM 'RemainingLatLong.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
I then ran the same geographical checks on the newly geocoded companies and found all of the good geocodes. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geoallcoords2;&lt;br /&gt;
 CREATE TABLE geoallcoords2 AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM remaining;&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geoallcoords3;&lt;br /&gt;
 CREATE TABLE geoallcoords3 AS&lt;br /&gt;
 SELECT *, CASE WHEN statecode='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN statecode='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN statecode='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM geoallcoords2;&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodgeonew;&lt;br /&gt;
 CREATE TABLE goodgeonew AS&lt;br /&gt;
 SELECT A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM companybasecore AS A LEFT JOIN geoallcoords3 AS B ON&lt;br /&gt;
 A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --5913&lt;br /&gt;
&lt;br /&gt;
I then combined the old and new geocodes and matched them back to the company base table to get a geo table for companies.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geocodesportco;&lt;br /&gt;
 CREATE TABLE geocodesportco AS&lt;br /&gt;
 SELECT * FROM goodgeonew&lt;br /&gt;
 UNION&lt;br /&gt;
 SELECT * FROM goodgeoold;&lt;br /&gt;
 --44411&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE portcogeo;&lt;br /&gt;
 CREATE TABLE portcogeo AS SELECT&lt;br /&gt;
 A.*, B.latitude, B.longitude FROM companybasecore AS A LEFT JOIN Geocodesportco AS B ON A.coname=B.coname AND A.datefirstinv=B.datefirstinv AND A.statecode=B.statecode;&lt;br /&gt;
 --48001&lt;br /&gt;
&lt;br /&gt;
===Firms===&lt;br /&gt;
This process is largely the same as for companies. I found old firms that had already been geocoded and checked for accuracy.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE oldfirmcoords;&lt;br /&gt;
 CREATE TABLE oldfirmcoords (&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
 &lt;br /&gt;
 \COPY oldfirmcoords FROM 'FirmCoords.txt' DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --5556&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmoldfilter;&lt;br /&gt;
 CREATE TABLE firmoldfilter AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM oldfirmcoords;&lt;br /&gt;
 --5556&lt;br /&gt;
&lt;br /&gt;
Since oldfirmcoords does not have state codes, we need to bring in state codes before we can add back firms based in Puerto Rico, Hawaii, and Alaska. I did this by matching the firmoldfilter table back to the firm base table. I used the COALESCE function to set excludeflag to 1 for firms with no match, because we wanted to exclude firms that had not been geocoded due to faulty addresses.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsmatch1;&lt;br /&gt;
 CREATE TABLE firmcoordsmatch1 AS SELECT &lt;br /&gt;
 A.firmname, A.state, B.latitude, B.longitude, COALESCE(B.excludeflag, 1) AS excludeflag FROM firmbasecore AS A LEFT JOIN firmoldfilter AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
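&lt;br /&gt;
The effect of COALESCE on an unmatched LEFT JOIN row can be seen on toy data. This sketch uses Python's sqlite3 module in place of PostgreSQL; the tables firms and coords and their rows are invented.&lt;br /&gt;

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; tables and rows are invented.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE firms (firmname TEXT, state TEXT)")
con.execute(
    "CREATE TABLE coords "
    "(firmname TEXT, latitude REAL, longitude REAL, excludeflag INTEGER)"
)
con.executemany(
    "INSERT INTO firms VALUES (?, ?)",
    [("Alpha Ventures", "CA"), ("Beta Capital", "NY")],
)
con.execute("INSERT INTO coords VALUES ('Alpha Ventures', 37.44, -122.16, 0)")

# Beta Capital has no coordinates, so B.excludeflag is NULL after the join;
# COALESCE turns that NULL into 1 so the firm is excluded downstream.
rows = con.execute(
    """
    SELECT A.firmname, B.latitude, COALESCE(B.excludeflag, 1) AS excludeflag
    FROM firms AS A LEFT JOIN coords AS B ON A.firmname = B.firmname
    ORDER BY A.firmname
    """
).fetchall()
print(rows)
```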
&lt;br /&gt;
Then the process of tagging the PR, HI, and AK firms and including only correctly tagged firms is the same as for companies.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsexternal;&lt;br /&gt;
 CREATE TABLE firmcoordsexternal AS&lt;br /&gt;
 SELECT *, CASE WHEN state='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN state='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN state='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM firmcoordsmatch1;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodfirmgeoold;&lt;br /&gt;
 CREATE TABLE goodfirmgeoold AS SELECT&lt;br /&gt;
 A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM firmcoreonedupremoved AS A LEFT JOIN firmcoordsexternal AS B ON A.firmname=B.firmname&lt;br /&gt;
 WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --5346&lt;br /&gt;
&lt;br /&gt;
Find the remaining firms and run the geocode script on them.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE remainingfirm;&lt;br /&gt;
 CREATE TABLE remainingfirm AS SELECT A.firmname, A.addr1, A.addr2, A.city, A.state, A.zip FROM firmcoreonedupremoved AS A LEFT JOIN goodfirmgeoold AS B ON A.firmname=B.firmname&lt;br /&gt;
 WHERE B.firmname IS NULL AND A.addr1 IS NOT NULL AND A.msacode!='9999';&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
 \COPY remainingfirm TO 'FirmGeoRemaining.txt' DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmremainingcoords;&lt;br /&gt;
 CREATE TABLE firmremainingcoords(&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY firmremainingcoords FROM 'FirmRemainingCoords.txt' DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
Follow the same filtering process as above to get the good geocodes. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmnewfilter;&lt;br /&gt;
 CREATE TABLE firmnewfilter AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM firmremainingcoords;&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsmatch2;&lt;br /&gt;
 CREATE TABLE firmcoordsmatch2 AS SELECT &lt;br /&gt;
 A.firmname, A.state, B.latitude, B.longitude, COALESCE(B.excludeflag, 1) AS excludeflag FROM firmcoreonedupremoved AS A LEFT JOIN firmnewfilter AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsexternalremaining;&lt;br /&gt;
 CREATE TABLE firmcoordsexternalremaining AS&lt;br /&gt;
 SELECT *, CASE WHEN state='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN state='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN state='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM firmcoordsmatch2;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodfirmgeonew;&lt;br /&gt;
 CREATE TABLE goodfirmgeonew AS SELECT A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM firmcoreonedupremoved AS A LEFT JOIN firmcoordsexternalremaining AS B &lt;br /&gt;
 ON A.firmname=B.firmname&lt;br /&gt;
 WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --703&lt;br /&gt;
&lt;br /&gt;
Combine the old and new geocoded firms and match them to firm base to get a firm geo table.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmgeocoords;&lt;br /&gt;
 CREATE TABLE firmgeocoords AS&lt;br /&gt;
 SELECT * FROM goodfirmgeonew&lt;br /&gt;
 UNION&lt;br /&gt;
 SELECT * FROM goodfirmgeoold;&lt;br /&gt;
 --6049&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmgeocore;&lt;br /&gt;
 CREATE TABLE firmgeocore AS&lt;br /&gt;
 SELECT A.*, B.latitude, B.longitude FROM firmbasecore AS A LEFT JOIN firmgeocoords AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
===Branch Offices===&lt;br /&gt;
I did not use old branch office data because I could not find it anywhere in the old data set. I have since found old data in the table firmbasecoords in vcdb2. &lt;br /&gt;
&lt;br /&gt;
First copy all of the needed data out of the database to do geocoding.&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT A.firmname, A.boaddr1, A.boaddr2, A.bocity, A.bostate, A.bozip FROM bonound AS A WHERE A.boaddr1 IS NOT NULL) TO 'BranchOffices.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --2046&lt;br /&gt;
&lt;br /&gt;
Then load the data into the database and follow the same filtering process as above.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeo;&lt;br /&gt;
 CREATE TABLE bogeo (&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY bogeo FROM 'BranchOfficesGeo.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --2046&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeo1;&lt;br /&gt;
 CREATE TABLE bogeo1 AS SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM bogeo;&lt;br /&gt;
 --2046&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bomatchgeo;&lt;br /&gt;
 CREATE TABLE bomatchgeo AS&lt;br /&gt;
 SELECT A.*, B.latitude, B.longitude, COALESCE(B.excludeflag, 1) AS excludeflag FROM branchofficecore AS A LEFT JOIN bogeo1 AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeo2;&lt;br /&gt;
 CREATE TABLE bogeo2 AS&lt;br /&gt;
 SELECT *, CASE WHEN bostate='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN bostate='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN bostate='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM bomatchgeo;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
Match the correctly geocoded branch offices back to firm base to get the final table.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeocore1;&lt;br /&gt;
 CREATE TABLE bogeocore1 AS&lt;br /&gt;
 SELECT * FROM bogeo2 WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --1161&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmbogeo;&lt;br /&gt;
 CREATE TABLE firmbogeo AS&lt;br /&gt;
 SELECT A.*, B.latitude AS BOLatitude, B.longitude AS BOLongitude FROM firmgeocore AS A LEFT JOIN bogeocore1 AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
==Creating People Tables==&lt;br /&gt;
We pulled data on executives in both portcos and funds. I describe the process below. If any of the explanations don't make sense, I also describe most tables in the section called Marcos's Code.&lt;br /&gt;
===Company People===&lt;br /&gt;
 DROP TABLE titlelookup;&lt;br /&gt;
 CREATE TABLE titlelookup(&lt;br /&gt;
 	fulltitle varchar(150),&lt;br /&gt;
 	charman int, &lt;br /&gt;
 	ceo int,&lt;br /&gt;
 	cfo int,&lt;br /&gt;
 	coo int,&lt;br /&gt;
 	cio int,&lt;br /&gt;
 	cto int,&lt;br /&gt;
 	otherclvl int,&lt;br /&gt;
 	boardmember int,&lt;br /&gt;
 	president int,&lt;br /&gt;
 	vp int,&lt;br /&gt;
 	founder int,&lt;br /&gt;
 	director int&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY titlelookup FROM 'Important Titles in Women2017 dataset.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --628&lt;br /&gt;
&lt;br /&gt;
This table lists the various job titles that appear in the data and flags which traditional executive roles each one corresponds to.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE copeople;&lt;br /&gt;
 CREATE TABLE copeople(&lt;br /&gt;
 	datefirstinv   date,&lt;br /&gt;
 	cname varchar(150),&lt;br /&gt;
 	statecode  varchar(2),&lt;br /&gt;
 	prefix varchar(5),&lt;br /&gt;
 	firstname varchar(50),&lt;br /&gt;
 	lastname varchar(50),&lt;br /&gt;
 	jobtitle varchar(150),&lt;br /&gt;
 	nonmanaging  varchar(1),&lt;br /&gt;
 	prevpos  varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY copeople FROM 'Executives-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --194359&lt;br /&gt;
&lt;br /&gt;
This table holds executives from portcos and is loaded from SDC. The next query identifies which traditional executive-level job each listed job title corresponds to, and whether the prefix marks an executive as male or female. I mistakenly wrote cname instead of coname when loading the data. If you want to save yourself work, write coname.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE copeoplebase;&lt;br /&gt;
 CREATE TABLE copeoplebase AS&lt;br /&gt;
 SELECT copeople.*,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 1::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 0::int&lt;br /&gt;
 	ELSE Null::int END AS titlefemale,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 0::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 1::int&lt;br /&gt;
 	ELSE Null::int END AS titlemale,&lt;br /&gt;
 CASE WHEN prefix='Dr' THEN 1::int&lt;br /&gt;
 	ELSE 0::int END AS doctor,&lt;br /&gt;
 CASE WHEN prefix IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hastitle,&lt;br /&gt;
 CASE WHEN prefix IS NULL AND firstname IS NULL AND lastname IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hasperson,&lt;br /&gt;
 CASE WHEN fulltitle IS NOT NULL THEN 1::int ELSE 0::int END AS hastitlelookup,&lt;br /&gt;
 CASE WHEN charman IS NOT NULL THEN charman ELSE 0::int END AS chairman,&lt;br /&gt;
 CASE WHEN ceo IS NOT NULL THEN ceo ELSE 0::int END AS ceo,&lt;br /&gt;
 CASE WHEN cfo IS NOT NULL THEN cfo ELSE 0::int END AS cfo,&lt;br /&gt;
 CASE WHEN coo IS NOT NULL THEN coo ELSE 0::int END AS coo,&lt;br /&gt;
 CASE WHEN cio IS NOT NULL THEN cio ELSE 0::int END AS cio,&lt;br /&gt;
 CASE WHEN cto IS NOT NULL THEN cto ELSE 0::int END AS cto,&lt;br /&gt;
 CASE WHEN otherclvl IS NOT NULL THEN otherclvl ELSE 0::int END AS otherclvl,&lt;br /&gt;
 CASE WHEN boardmember IS NOT NULL THEN boardmember ELSE 0::int END AS boardmember,&lt;br /&gt;
 CASE WHEN president IS NOT NULL THEN president ELSE 0::int END AS president,&lt;br /&gt;
 CASE WHEN vp IS NOT NULL THEN vp ELSE 0::int END AS vp,&lt;br /&gt;
 CASE WHEN founder IS NOT NULL THEN founder ELSE 0::int END AS founder,&lt;br /&gt;
 CASE WHEN director IS NOT NULL THEN director ELSE 0::int END AS director&lt;br /&gt;
 FROM copeople&lt;br /&gt;
 LEFT JOIN titlelookup ON copeople.jobtitle=titlelookup.fulltitle;&lt;br /&gt;
 --194359&lt;br /&gt;
&lt;br /&gt;
Next we will try to identify whether an executive is male or female based on their names.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE namegender;&lt;br /&gt;
 CREATE TABLE namegender AS&lt;br /&gt;
 SELECT firstname, &lt;br /&gt;
 CASE WHEN countfemale &amp;gt; 0 AND countmale=0 THEN 1::int ELSE 0::int END AS exclusivelyfemale,&lt;br /&gt;
 CASE WHEN countmale &amp;gt; 0 AND countfemale=0 THEN 1::int ELSE 0::int END AS exclusivelymale&lt;br /&gt;
 FROM&lt;br /&gt;
 	(SELECT firstname, COALESCE(sum(titlefemale),0) as countfemale,  COALESCE(sum(titlemale),0) as countmale &lt;br /&gt;
 	FROM copeoplebase WHERE doctor=0&lt;br /&gt;
 	GROUP BY firstname) As T&lt;br /&gt;
 WHERE NOT (countfemale &amp;gt; 0 AND countmale&amp;gt;0);&lt;br /&gt;
 --12736&lt;br /&gt;
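&lt;br /&gt;
The name-based gender lookup can be sketched in plain Python. The list people and its contents are invented; the logic follows the query above, counting Ms and Mr prefixes per first name, skipping doctors, and dropping names seen with both prefixes.&lt;br /&gt;

```python
from collections import defaultdict

# (firstname, prefix) pairs, invented for illustration
people = [
    ("Mary", "Ms"), ("Mary", "Ms"), ("John", "Mr"),
    ("Pat", "Ms"), ("Pat", "Mr"), ("Lee", None), ("Ann", "Dr"),
]

counts = defaultdict(lambda: [0, 0])  # firstname -> [female count, male count]
for firstname, prefix in people:
    if prefix == "Dr":
        continue  # the query filters on doctor=0
    tally = counts[firstname]  # creates the entry even with no prefix
    if prefix == "Ms":
        tally[0] += 1
    elif prefix == "Mr":
        tally[1] += 1

namegender = {}  # firstname -> (exclusivelyfemale, exclusivelymale)
for firstname, (female, male) in counts.items():
    if female and male:
        continue  # ambiguous names are dropped by the WHERE NOT (...) filter
    namegender[firstname] = (int(female != 0), int(male != 0))

print(namegender)
```

A name that never appears with Ms or Mr (Lee here) stays in the lookup with both flags at 0, so it does not help classify anyone.&lt;br /&gt;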
&lt;br /&gt;
The next table expands CoPeopleBase to include information on executive gender and executive position.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE CoPeopleFull;&lt;br /&gt;
 CREATE TABLE CoPeopleFull AS&lt;br /&gt;
 SELECT copeoplebase.*,&lt;br /&gt;
 CASE WHEN titlefemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelyfemale=1 THEN 1::int ELSE 0::int END AS female,&lt;br /&gt;
 CASE WHEN titlemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelymale=1 THEN 1::int ELSE 0::int END AS male,	&lt;br /&gt;
 CASE WHEN (titlefemale=1 OR titlemale=1 OR exclusivelymale=1 OR exclusivelyfemale=1) THEN 0::int ELSE 1::int END AS unknowngender,&lt;br /&gt;
 CASE WHEN (ceo=1 OR president=1) THEN 1::int ELSE 0::int END AS ceopres,&lt;br /&gt;
 CASE WHEN (chairman=1 OR ceo=1 OR cfo=1 OR coo=1 OR cio=1 OR cto=1 OR otherclvl=1 OR president=1) THEN 1::int ELSE 0::int END AS CLevel,&lt;br /&gt;
 CASE WHEN (chairman=1 OR ceo=1 OR cfo=1 OR coo=1 OR cio=1 OR cto=1 OR otherclvl=1 OR president=1 OR director=1 OR boardmember=1) THEN 1::int ELSE 0::int END AS board,&lt;br /&gt;
 CASE WHEN (chairman=1 OR ceo=1 OR cfo=1 OR coo=1 OR cio=1 OR cto=1 OR otherclvl=1 OR president=1 OR director=1 OR boardmember=1 OR vp=1 OR founder=1) THEN 1::int ELSE &lt;br /&gt;
 0::int END AS vpandabove&lt;br /&gt;
 FROM copeoplebase&lt;br /&gt;
 LEFT JOIN namegender ON namegender.firstname=copeoplebase.firstname&lt;br /&gt;
 WHERE hasperson=1;&lt;br /&gt;
 --177547&lt;br /&gt;
&lt;br /&gt;
The next table only keeps executive listings that have a valid portco primary key associated with them. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE CoPeopleKey;&lt;br /&gt;
 CREATE TABLE CoPeopleKey AS&lt;br /&gt;
 SELECT A.*&lt;br /&gt;
 FROM CoPeopleFull AS A&lt;br /&gt;
 JOIN (SELECT firstname, lastname, cname, datefirstinv, statecode, count(*) FROM CoPeopleFull &lt;br /&gt;
 WHERE firstname IS NOT NULL AND lastname IS NOT NULL AND cname IS NOT NULL AND datefirstinv IS NOT NULL AND statecode IS NOT NULL&lt;br /&gt;
 GROUP BY firstname, lastname, cname, datefirstinv, statecode HAVING COUNT(*)=1) AS B&lt;br /&gt;
 ON A.firstname=B.firstname AND A.lastname=B.lastname AND A.datefirstinv=B.datefirstinv AND A.cname=B.cname AND A.statecode=B.statecode;&lt;br /&gt;
 --176251&lt;br /&gt;
&lt;br /&gt;
The next table identifies whether a person previously held executive positions.&lt;br /&gt;
&lt;br /&gt;
 CREATE TABLE CoPeopleSerial AS&lt;br /&gt;
 SELECT firstname, lastname, cname, datefirstinv, statecode, &lt;br /&gt;
 COALESCE(sum(hasperson),0) as prev,&lt;br /&gt;
 COALESCE(sum(ceo),0) as prevceo,&lt;br /&gt;
 COALESCE(sum(ceopres),0) as prevceopres,&lt;br /&gt;
 COALESCE(sum(founder),0) as prevfounder,&lt;br /&gt;
 COALESCE(sum(clevel),0) as prevclevel,&lt;br /&gt;
 COALESCE(sum(board),0) as prevboard,&lt;br /&gt;
 COALESCE(sum(vpandabove),0) as prevvpandabove,&lt;br /&gt;
 CASE WHEN sum(hasperson) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serial,&lt;br /&gt;
 CASE WHEN sum(ceo) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialceo,&lt;br /&gt;
 CASE WHEN sum(ceopres) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialceopres,&lt;br /&gt;
 CASE WHEN sum(founder) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialfounder,&lt;br /&gt;
 CASE WHEN sum(clevel) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialclevel,&lt;br /&gt;
 CASE WHEN sum(board) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialboard,&lt;br /&gt;
 CASE WHEN sum(vpandabove) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialvpandabove&lt;br /&gt;
 FROM (&lt;br /&gt;
 	SELECT A.prefix, A.firstname, A.lastname, A.cname, A.datefirstinv, A.statecode, &lt;br /&gt;
 	B.cname as prevcname, B.datefirstinv as prevdatefirstinv, B.statecode as prevstatecode, B.ceo, B.ceopres, B.founder, B.clevel, B.board, B.vpandabove, B.hasperson&lt;br /&gt;
 	FROM CoPeopleKey AS A&lt;br /&gt;
 	LEFT JOIN CoPeopleKey AS B ON A.firstname=B.firstname AND A.lastname=B.lastname AND A.datefirstinv &amp;gt; B.datefirstinv&lt;br /&gt;
 ) AS T&lt;br /&gt;
 GROUP BY firstname, lastname, cname, datefirstinv, statecode;&lt;br /&gt;
 --176251&lt;br /&gt;
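&lt;br /&gt;
The self-join behind CoPeopleSerial can be illustrated in plain Python. The rows in roles are invented, and only a few of the output columns are reproduced; the point is that each role is matched to every earlier role held under the same first and last name.&lt;br /&gt;

```python
from datetime import date

# (firstname, lastname, cname, datefirstinv, ceo) rows, invented
roles = [
    ("Ana", "Diaz", "FirstCo", date(2001, 1, 1), 1),
    ("Ana", "Diaz", "SecondCo", date(2005, 1, 1), 0),
    ("Bo", "Chan", "OnlyCo", date(2003, 1, 1), 1),
]

serial = {}
for first, last, cname, started, _ceo in roles:
    # Earlier roles under the same name: A.datefirstinv > B.datefirstinv.
    earlier = [
        r for r in roles
        if (r[0], r[1]) == (first, last) and started > r[3]
    ]
    prev = len(earlier)
    prevceo = sum(r[4] for r in earlier)
    serial[(first, last, cname)] = {
        "prev": prev,
        "serial": int(prev != 0),
        "serialceo": int(prevceo != 0),
    }

for key, value in serial.items():
    print(key, value)
```

Because the match is on first and last name only, two different people who share a name would be treated as one serial executive.&lt;br /&gt;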
&lt;br /&gt;
The last table aggregates a ton of information on executives for each company. There is too much information to explain it all. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE copeopleagg;&lt;br /&gt;
 CREATE TABLE copeopleagg AS&lt;br /&gt;
 SELECT A.cname, A.datefirstinv, A.statecode, &lt;br /&gt;
 sum(hasperson) as numperson,&lt;br /&gt;
 sum(hastitle) as numtitled,&lt;br /&gt;
 CASE WHEN sum(ceopres) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasceopres,&lt;br /&gt;
 CASE WHEN sum(founder) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasfounder,&lt;br /&gt;
 CASE WHEN sum(clevel) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasclevel,&lt;br /&gt;
 CASE WHEN sum(board) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasboard,&lt;br /&gt;
 CASE WHEN sum(vpandabove) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasvpandabove,&lt;br /&gt;
 sum(female) as females,&lt;br /&gt;
 sum(male) as males,&lt;br /&gt;
 sum(unknowngender) as ugs,&lt;br /&gt;
 sum(doctor*female) as femaledoctors,&lt;br /&gt;
 sum(doctor*male) as maledoctors,&lt;br /&gt;
 sum(doctor*unknowngender) as ugdoctors,&lt;br /&gt;
 sum(ceopres*female) as femaleceos,&lt;br /&gt;
 sum(ceopres*male) as maleceos,&lt;br /&gt;
 sum(ceopres*unknowngender) as ugceos,&lt;br /&gt;
 sum(ceopres*female*doctor) as femaledoctorceos,&lt;br /&gt;
 sum(ceopres*male*doctor) as maledoctorceos,&lt;br /&gt;
 sum(ceopres*unknowngender*doctor) as ugdoctorceos,&lt;br /&gt;
 sum(founder*female) as femalefounders,&lt;br /&gt;
 sum(founder*male) as malefounders,&lt;br /&gt;
 sum(founder*unknowngender) as ugfounders,&lt;br /&gt;
 sum(founder*female*doctor) as femaledoctorfounders,&lt;br /&gt;
 sum(founder*male*doctor) as maledoctorfounders,&lt;br /&gt;
 sum(founder*unknowngender*doctor) as ugdoctorfounders,&lt;br /&gt;
 sum(clevel*female) as femaleclevels,&lt;br /&gt;
 sum(clevel*male) as maleclevels,&lt;br /&gt;
 sum(clevel*unknowngender) as ugclevels,&lt;br /&gt;
 sum(clevel*female*doctor) as femaledoctorclevels,&lt;br /&gt;
 sum(clevel*male*doctor) as maledoctorclevels,&lt;br /&gt;
 sum(clevel*unknowngender*doctor) as ugdoctorclevels,&lt;br /&gt;
 sum(board*female) as femaleboards,&lt;br /&gt;
 sum(board*male) as maleboards,&lt;br /&gt;
 sum(board*unknowngender) as ugboards,&lt;br /&gt;
 sum(board*female*doctor) as femaledoctorboards,&lt;br /&gt;
 sum(board*male*doctor) as maledoctorboards,&lt;br /&gt;
 sum(board*unknowngender*doctor) as ugdoctorboards,&lt;br /&gt;
 sum(vpandabove*female) as femaleabovevps,&lt;br /&gt;
 sum(vpandabove*male) as maleabovevps,&lt;br /&gt;
 sum(vpandabove*unknowngender) as ugabovevps,&lt;br /&gt;
 sum(vpandabove*female*doctor) as femaledoctorabovevps,&lt;br /&gt;
 sum(vpandabove*male*doctor) as maledoctorabovevps,&lt;br /&gt;
 sum(vpandabove*unknowngender*doctor) as ugdoctorabovevps,&lt;br /&gt;
 sum(prev*female) as femaleprevs,&lt;br /&gt;
 sum(prev*male) as maleprevs,&lt;br /&gt;
 sum(prev*unknowngender) as ugprevs,&lt;br /&gt;
 sum(prevceopres*female) as femaleprevceopres,&lt;br /&gt;
 sum(prevceopres*male) as maleprevceopres,&lt;br /&gt;
 sum(prevceopres*unknowngender) as ugprevceopres,&lt;br /&gt;
 sum(prevfounder*female) as femaleprevfounder,&lt;br /&gt;
 sum(prevfounder*male) as maleprevfounder,&lt;br /&gt;
 sum(prevfounder*unknowngender) as ugprevfounder,&lt;br /&gt;
 sum(prevclevel*female) as femaleprevclevel,&lt;br /&gt;
 sum(prevclevel*male) as maleprevclevel,&lt;br /&gt;
 sum(prevclevel*unknowngender) as ugprevclevel,&lt;br /&gt;
 sum(prevboard*female) as femaleprevboard,&lt;br /&gt;
 sum(prevboard*male) as maleprevboard,&lt;br /&gt;
 sum(prevboard*unknowngender) as ugprevboard,&lt;br /&gt;
 sum(prevvpandabove*female) as femaleprevvpandabove,&lt;br /&gt;
 sum(prevvpandabove*male) as maleprevvpandabove,&lt;br /&gt;
 sum(prevvpandabove*unknowngender) as ugprevvpandabove,&lt;br /&gt;
 sum(serial*female) as femaleserials,&lt;br /&gt;
 sum(serial*male) as maleserials,&lt;br /&gt;
 sum(serial*unknowngender) as ugserials,&lt;br /&gt;
 sum(serialceopres*female) as femaleserialceopres,&lt;br /&gt;
 sum(serialceopres*male) as maleserialceopres,&lt;br /&gt;
 sum(serialceopres*unknowngender) as ugserialceopres,&lt;br /&gt;
 sum(serialfounder*female) as femaleserialfounder,&lt;br /&gt;
 sum(serialfounder*male) as maleserialfounder,&lt;br /&gt;
 sum(serialfounder*unknowngender) as ugserialfounder,&lt;br /&gt;
 sum(serialclevel*female) as femaleserialclevel,&lt;br /&gt;
 sum(serialclevel*male) as maleserialclevel,&lt;br /&gt;
 sum(serialclevel*unknowngender) as ugserialclevel,&lt;br /&gt;
 sum(serialboard*female) as femaleserialboard,&lt;br /&gt;
 sum(serialboard*male) as maleserialboard,&lt;br /&gt;
 sum(serialboard*unknowngender) as ugserialboard,&lt;br /&gt;
 sum(serialvpandabove*female) as femaleserialvpandabove,&lt;br /&gt;
 sum(serialvpandabove*male) as maleserialvpandabove,&lt;br /&gt;
 sum(serialvpandabove*unknowngender) as ugserialvpandabove,&lt;br /&gt;
 sum(ceopres*serialceopres*female) as femaleceopresserialceopres,&lt;br /&gt;
 sum(ceopres*serialceopres*male) as maleceopresserialceopres,&lt;br /&gt;
 sum(ceopres*serialceopres*unknowngender) as ugceopresserialceopres,&lt;br /&gt;
 sum(founder*serialfounder*female) as femalefounderserialfounder,&lt;br /&gt;
 sum(founder*serialfounder*male) as malefounderserialfounder,&lt;br /&gt;
 sum(founder*serialfounder*unknowngender) as ugfounderserialfounder &lt;br /&gt;
 FROM CoPeoplekey AS A&lt;br /&gt;
 JOIN CoPeopleSerial AS B &lt;br /&gt;
 ON A.firstname=B.firstname AND A.lastname=B.lastname AND A.datefirstinv=B.datefirstinv AND A.cname=B.cname AND A.statecode=B.statecode&lt;br /&gt;
 GROUP BY A.cname, A.datefirstinv, A.statecode;&lt;br /&gt;
 --30413&lt;br /&gt;
&lt;br /&gt;
Since this table is so big, it is a good idea to have a smaller, more manageable table to work with. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE copeopleaggsimple;&lt;br /&gt;
 CREATE TABLE copeopleaggsimple AS&lt;br /&gt;
 SELECT cname, datefirstinv, statecode, numperson, females, males, ugs, ugdoctors, maleserials+femaleserials+ugserials AS serials&lt;br /&gt;
 FROM copeopleagg;&lt;br /&gt;
 --30413&lt;br /&gt;
&lt;br /&gt;
===Fund People===&lt;br /&gt;
Luckily, this process is much easier than the company people process. First, we simply load the data into the database.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundpeople;&lt;br /&gt;
 CREATE TABLE fundpeople(&lt;br /&gt;
 	fundname  varchar(255),&lt;br /&gt;
 	fundyear  int,&lt;br /&gt;
 	prefix varchar(5),&lt;br /&gt;
 	firstname varchar(50),&lt;br /&gt;
 	lastname varchar(50),&lt;br /&gt;
 	jobtitle varchar(150),&lt;br /&gt;
 	prevpos varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY fundpeople FROM 'Executives-Funds-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --328994&lt;br /&gt;
&lt;br /&gt;
The next table identifies degree and sex information about the executives of the fund.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundpeoplebase;&lt;br /&gt;
 CREATE TABLE fundpeoplebase AS&lt;br /&gt;
 SELECT fundpeople.*,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 1::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 0::int&lt;br /&gt;
 	ELSE Null::int END AS titlefemale,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 0::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 1::int&lt;br /&gt;
 	ELSE Null::int END AS titlemale,&lt;br /&gt;
 CASE WHEN prefix='Dr' THEN 1::int&lt;br /&gt;
 	ELSE 0::int END AS doctor,&lt;br /&gt;
 CASE WHEN prefix IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hastitle,&lt;br /&gt;
 CASE WHEN prefix IS NULL AND firstname IS NULL AND lastname IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hasperson&lt;br /&gt;
 FROM fundpeople;&lt;br /&gt;
 --328994&lt;br /&gt;
&lt;br /&gt;
The next table tries to identify the sex of each executive using the namegender table defined above. It keeps only rows where a person is actually listed (hasperson=1).&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE FundPeopleFull;&lt;br /&gt;
 CREATE TABLE FundPeopleFull AS&lt;br /&gt;
 SELECT fundpeoplebase.*, exclusivelyfemale, exclusivelymale,&lt;br /&gt;
 CASE WHEN titlefemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelyfemale=1 AND exclusivelymale=0 AND (titlemale=0 OR titlemale IS NULL) THEN 1::int ELSE 0::int END AS female,&lt;br /&gt;
 CASE WHEN titlemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelymale=1  AND exclusivelyfemale=0 AND (titlefemale =0 OR titlefemale IS NULL) THEN 1::int ELSE 0::int END AS male,	&lt;br /&gt;
 CASE WHEN (titlefemale=1 OR titlemale=1 OR exclusivelymale=1 OR exclusivelyfemale=1) THEN 0::int ELSE 1::int END AS unknowngender&lt;br /&gt;
 FROM fundpeoplebase&lt;br /&gt;
 LEFT JOIN namegender ON namegender.firstname=fundpeoplebase.firstname&lt;br /&gt;
 WHERE hasperson=1;&lt;br /&gt;
 --320915&lt;br /&gt;
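&lt;br /&gt;
One thing worth checking, offered as a minimal sketch: the LEFT JOIN preserves the row count only if namegender has at most one row per firstname, which is an assumption here. The following query compares the two counts; they should be equal if the join did not duplicate any rows.&lt;br /&gt;
&lt;br /&gt;
 --Illustrative check, not part of the original loading script&lt;br /&gt;
 SELECT&lt;br /&gt;
 (SELECT COUNT(*) FROM fundpeoplebase WHERE hasperson=1) AS baserows,&lt;br /&gt;
 (SELECT COUNT(*) FROM FundPeopleFull) AS fullrows;&lt;br /&gt;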
&lt;br /&gt;
The next table gives you information on executives aggregated by fund.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE FundPeopleAgg;&lt;br /&gt;
 CREATE TABLE FundPeopleAgg AS&lt;br /&gt;
 SELECT fundname, &lt;br /&gt;
 sum(female) as numfemale,&lt;br /&gt;
 sum(male) as nummale,&lt;br /&gt;
 sum(unknowngender) as numunknowngender,&lt;br /&gt;
 sum(doctor) as numdoctor,&lt;br /&gt;
 sum(female*doctor) as numfemaledoctor,&lt;br /&gt;
 sum(male*doctor) as nummaledoctor,&lt;br /&gt;
 sum(unknowngender*doctor) as numunknowngenderdoctor,&lt;br /&gt;
 sum(hastitle) as numtitled,&lt;br /&gt;
 sum(hasperson) as numpeople, &lt;br /&gt;
 CASE WHEN sum(hasperson) &amp;gt; 0 THEN sum(female)::real/sum(hasperson) ELSE NULL END as fracfemale, --cast to real to avoid integer division&lt;br /&gt;
 CASE WHEN sum(male) &amp;gt; 0 THEN sum(female)::real/sum(male) ELSE NULL END as ratiofemale&lt;br /&gt;
 FROM FundPeopleFull&lt;br /&gt;
 GROUP BY fundname;&lt;br /&gt;
 --21536&lt;br /&gt;
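&lt;br /&gt;
A note on the two ratio columns, offered as a sketch rather than a description of the original run: in PostgreSQL, sum() over an int column returns bigint, and dividing one bigint by another truncates the result to a whole number. A fraction therefore needs a cast on one side of the division. The following self-contained query shows the difference:&lt;br /&gt;
&lt;br /&gt;
 --Illustrative only; does not touch any project table&lt;br /&gt;
 SELECT 3::bigint/4::bigint AS intdiv, --0, integer division truncates&lt;br /&gt;
 3::bigint::real/4::bigint AS realdiv; --0.75&lt;br /&gt;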
&lt;br /&gt;
It is also good to have this information on firms. We do not pull firm people information from SDC. However, we have enough information to create it from preexisting tables.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmpeopleagg;&lt;br /&gt;
 CREATE TABLE firmpeopleagg AS &lt;br /&gt;
 SELECT _firmname as firmname, sum(numfemale) as firmwomen, sum(nummale) as firmmen, sum(numunknowngender) as firmugs, &lt;br /&gt;
 sum(numdoctor) as firmdoctors, sum(numpeople) as firmpeople,&lt;br /&gt;
 CASE WHEN sum(numpeople) &amp;gt; 0 THEN (sum(numfemale)/sum(numpeople))::real ELSE NULL END as firmfracwomen,&lt;br /&gt;
 CASE WHEN sum(nummale) &amp;gt; 0 THEN (sum(numfemale)/sum(nummale))::real ELSE NULL END as firmratiowomen&lt;br /&gt;
 FROM roundlineaggfunds AS A&lt;br /&gt;
 JOIN fundpeopleagg AS B ON A._fundname=B.fundname&lt;br /&gt;
 GROUP BY _firmname;&lt;br /&gt;
 --5273&lt;br /&gt;
&lt;br /&gt;
==Marcos's Code==&lt;br /&gt;
This is code that a Rice student, Marcos Lee, wrote. I cleaned it and ran it. I have described the tables that I built and where they come from below. My code is located in:&lt;br /&gt;
 E:\McNair\Projects\VentureXpert Database\vcdb3\LoadingScripts\MatchingEntrepsV3&lt;br /&gt;
&lt;br /&gt;
If you have trouble following my explanation, go to this location and read the queries. Most of them are straightforward. &lt;br /&gt;
===Describing Stacks Created in Code===&lt;br /&gt;
 CoPeopleBase:&lt;br /&gt;
 -Builds from copeople and titlelookup&lt;br /&gt;
 -Identifies what roles people played in their companies&lt;br /&gt;
&lt;br /&gt;
 namegender:&lt;br /&gt;
 -built from copeoplebase&lt;br /&gt;
 -identifies male/female/unknown&lt;br /&gt;
&lt;br /&gt;
 CoPeopleFull:&lt;br /&gt;
 -built from copeoplebase and namegender&lt;br /&gt;
 -builds more extensive information on executives, including specifically what level of executive they are&lt;br /&gt;
&lt;br /&gt;
 CoPeopleKey:&lt;br /&gt;
 -built from CoPeopleFull&lt;br /&gt;
 -creates table where only executives with full primary keys are kept&lt;br /&gt;
&lt;br /&gt;
 CoPeopleSerial:&lt;br /&gt;
 -built from copeoplekey&lt;br /&gt;
 -keeps track of executives' previous jobs at the executive level&lt;br /&gt;
&lt;br /&gt;
 CoPeopleAgg:&lt;br /&gt;
 -built from copeoplekey and copeopleserial&lt;br /&gt;
 -gets extensive information on executives for each company&lt;br /&gt;
&lt;br /&gt;
 FundPeopleBase:&lt;br /&gt;
 -built from fundpeople&lt;br /&gt;
 -identifies male/female/doctor&lt;br /&gt;
 -hasperson column is slightly odd: a row with only a lastname (no prefix or first name) still gets a 1 in the column. It seems to be of little use/too broad&lt;br /&gt;
&lt;br /&gt;
 FundPeopleFull:&lt;br /&gt;
 -built from fundpeoplebase, namegender&lt;br /&gt;
 -adds in male/female &lt;br /&gt;
&lt;br /&gt;
 Fundpeopleagg:&lt;br /&gt;
 -built from fundpeoplefull&lt;br /&gt;
 -has aggregations of gender info for each fund&lt;br /&gt;
&lt;br /&gt;
 RoundLineJoinerLeanffWlistno:&lt;br /&gt;
 -built from roundlinejoinerleanff&lt;br /&gt;
 -adds listno to funds&lt;br /&gt;
&lt;br /&gt;
 RoundLineAggFunds:&lt;br /&gt;
 -built from roundlinejoinerleanffwlistno and roundlineaggfirms&lt;br /&gt;
 -if there are two funds from one firm that invest in same portco, we choose only one and leave the others behind&lt;br /&gt;
&lt;br /&gt;
 RoundLineAggWExit:&lt;br /&gt;
 -built from roundlineaggfirms, portcoexitupdated, roundlineaggfunds&lt;br /&gt;
 -adds in exit information for each company in roundlineaggfirms&lt;br /&gt;
&lt;br /&gt;
 FirmPerf:&lt;br /&gt;
 -built from roundlineaggwexit&lt;br /&gt;
 -adds in various performance measures for a given firm &lt;br /&gt;
&lt;br /&gt;
 PortCoFundDemo:&lt;br /&gt;
 -built from roundlinejoinerleanffclean and fundpeopleagg&lt;br /&gt;
 -gives information on executives of funds who invested in the portcos&lt;br /&gt;
&lt;br /&gt;
 PortCoPeopleMaster:&lt;br /&gt;
 -built from PortCoMaster, PortCoIndustry, PortCoPatent, PortCoSBIR, copeopleagg, PortCoFundDemo, CPI, statelookupint&lt;br /&gt;
 -huge amount of data about companies and their executives&lt;br /&gt;
&lt;br /&gt;
 RoundAggDistBase:&lt;br /&gt;
 -built from portcogeo, firmbogeo, roundlineaggwexit&lt;br /&gt;
 -creates geographic points using long, lat from geocoding&lt;br /&gt;
&lt;br /&gt;
 RoundAggDist:&lt;br /&gt;
 -Built from roundaggdistbase&lt;br /&gt;
 -gets actual distances between portcos and firms. If a branch office exists and its distance to the portco is less than the firm's, the branch office is chosen. Also generates a random number&lt;br /&gt;
&lt;br /&gt;
 FirmPeopleAgg:&lt;br /&gt;
 -built from roundlineaggfunds, fundpeopleagg&lt;br /&gt;
 -finds information on executives from different firms&lt;br /&gt;
&lt;br /&gt;
 PortCoMatchmaster:&lt;br /&gt;
 -built from portcopatent, portcoindustry, portcosbir, copeopleaggsimple, portcoid&lt;br /&gt;
 -gets all information together about portcos&lt;br /&gt;
&lt;br /&gt;
 FirmMatchMaster:&lt;br /&gt;
 -built from firmperf, firmvars, firmpeopleagg, firmid&lt;br /&gt;
 -gets all information together about firms&lt;br /&gt;
&lt;br /&gt;
 RoundLineMasterBase:&lt;br /&gt;
 -built from portcomatchmaster, firmmatchmaster, roundaggdist, roundlineaggwexit&lt;br /&gt;
 -builds a large amount of information about portcos and firms, specifically info about exits and distances&lt;br /&gt;
&lt;br /&gt;
 MatchMostNumerous:&lt;br /&gt;
 -built from roundlinemasterbase&lt;br /&gt;
 -finds max number of portcos invested in by a firm that also invested in the company grouping by&lt;br /&gt;
&lt;br /&gt;
 MatchHighestRandom:&lt;br /&gt;
 -built from matchmostnumerous&lt;br /&gt;
 -if two firms that invested in one company had the same number of max port cos this randomly chooses one company&lt;br /&gt;
&lt;br /&gt;
 FirmActiveYearsCode20:&lt;br /&gt;
 -built from roundlinejoinerleanffclean, portcoindustry&lt;br /&gt;
 -adds firmname to industry code. It is not clear why DISTINCT is used in the query&lt;br /&gt;
&lt;br /&gt;
 RealMatchesCode20:&lt;br /&gt;
 -built from MatchHighestRandom, PortCoIndustry&lt;br /&gt;
 -real matches between portcos and firms that invested in them including the code20&lt;br /&gt;
&lt;br /&gt;
 SyntheticFirmSetBaseCode20:&lt;br /&gt;
 -built from realmatchescode20, firmactiveyearscode20&lt;br /&gt;
 -crossproduct of firms and portcos. finds firms that invested in same year as portco received first inv, firms invested in same type of company, and makes sure matches are unique&lt;br /&gt;
&lt;br /&gt;
 AllMatchKeys:&lt;br /&gt;
 -built from SyntheticFirmSetBaseCode20, RealMatchesCode20&lt;br /&gt;
 -combines synthetic and real matches&lt;br /&gt;
&lt;br /&gt;
 SynthRoundAggDistBaseCode20:&lt;br /&gt;
 -built from allmatchkeys, portcogeo, firmbogeo&lt;br /&gt;
 -builds points for all portco, firm listings in allmatch keys&lt;br /&gt;
&lt;br /&gt;
 SynthRoundAggDistCode20:&lt;br /&gt;
 -built from synthroundaggdistbasecode20&lt;br /&gt;
 -finds actual distance between portcos and firms using installed extensions. Chooses the branch office if the distance between the portco and the branch office is less than the distance to the firm&lt;br /&gt;
&lt;br /&gt;
 SynthFirmnameInduBlowoutCode20:&lt;br /&gt;
 -built from allmatchkeys, roundlinemasterbase&lt;br /&gt;
 -gets every firm combination and checks whether the companies that those firms invested in are in the same general industry&lt;br /&gt;
&lt;br /&gt;
 SynthFirmNameroundInduHistCode20:&lt;br /&gt;
 -built from SynthFirmnameInduBlowoutcode20&lt;br /&gt;
 -gets information by portco, firmname match about what the firms' past investment patterns are&lt;br /&gt;
&lt;br /&gt;
 MasterWithSynthBaseCode20Portco:&lt;br /&gt;
 -built from Allmatchkeys, matchhighestrandom, synthroundaggdistcode20, synthfirmnameroundinduhistcode20, synthfirmnameroundindutotalcode20, firmvars, copeopleaggsimple, portcomaster&lt;br /&gt;
 -builds a bunch of information about synthetic and real matches&lt;br /&gt;
&lt;br /&gt;
 SynthFirmnameRoundInduTotalCode20:&lt;br /&gt;
 -built from allmatchkeys, roundlinemasterbase&lt;br /&gt;
 -finds number of portcos in certain industries by firmnames&lt;br /&gt;
&lt;br /&gt;
 MasterWithSynthCode20Firms:&lt;br /&gt;
 -built with firmmatchmaster, allmatchkeys&lt;br /&gt;
 -matching a bunch of information to all firms&lt;br /&gt;
&lt;br /&gt;
 MasterWithSynthcode20:&lt;br /&gt;
 -built from masterwithsynthbasecode20portco, masterwithsynthcode20firms&lt;br /&gt;
 -gets a huge amount of info together on real and synthetic matches about firms and companies&lt;br /&gt;
&lt;br /&gt;
 MasterReals:&lt;br /&gt;
 -built from masterwithsynthcode20&lt;br /&gt;
 -gets just real matches from code&lt;br /&gt;
&lt;br /&gt;
 MasterOneSynth:&lt;br /&gt;
 -built from masterwithsynthcode20&lt;br /&gt;
 -gets just one randomly chosen synthetic match between companies and firms&lt;br /&gt;
&lt;br /&gt;
 MasterRealOneSynth:&lt;br /&gt;
 -built from masteronesynth, masterreals&lt;br /&gt;
 -combines the real and one synth table&lt;br /&gt;
&lt;br /&gt;
==Ranking Tables and Graphs==&lt;br /&gt;
This is a slight detour from the creation of VCDB3. However, this is a cool process because you actually get to use the data you've been working with. The process is extensive, but the queries are easy to understand. If you wish to have a deeper understanding of the process, read the code. It is located in:&lt;br /&gt;
&lt;br /&gt;
 E:\McNair\Projects\VentureXpert Database\vcdb3\LoadingScripts\RoundRanking.SQL&lt;br /&gt;
&lt;br /&gt;
First you must create a table that has aggregate round information grouped by cities and round year. Since this is a little difficult to picture, I will attach the code.&lt;br /&gt;
 DROP TABLE roundleveloutput;&lt;br /&gt;
 CREATE TABLE roundleveloutput AS SELECT&lt;br /&gt;
 city, statecode, roundyear AS year,&lt;br /&gt;
 SUM(rndamtestm*seedflag) AS seedamnt,&lt;br /&gt;
 SUM(rndamtestm*earlyflag) AS earlyamnt,&lt;br /&gt;
 SUM(rndamtestm*laterflag) AS lateramnt,&lt;br /&gt;
 SUM(rndamtestm*growthflag) AS selamnt,&lt;br /&gt;
 SUM(growthflag*dealflag) AS numseldeals&lt;br /&gt;
 FROM round GROUP BY city, statecode, roundyear;&lt;br /&gt;
 --30028&lt;br /&gt;
&lt;br /&gt;
Next, build the remaining tables in this order:&lt;br /&gt;
#Create a table that lists the all-time SEL amount by city. Keep including the state code to ensure that you have the right city, since city names are often repeated in different states.&lt;br /&gt;
#Create a table that lists each unique city, state for every year since 1980.&lt;br /&gt;
#Build a table that matches portcos to the city, state, year blowout table for each year they were alive. This table should be relatively large since it lists companies once for every year they were alive up until the present.&lt;br /&gt;
#Create a table that displays the number of companies alive in a city every year since 1980.&lt;br /&gt;
#Add a table that lists all of the information you have built in the previous tables based on city, state, year. Also add in population.&lt;br /&gt;
#Run the ranking queries.&lt;br /&gt;
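&lt;br /&gt;
As a minimal sketch of the city, state, year step: generate_series can produce one row per year, and a cross join pairs each year with every distinct city and state. The table name citystateyear and the end year 2018 are assumptions for illustration; the actual script may differ.&lt;br /&gt;
&lt;br /&gt;
 --Hypothetical table name and year range, for illustration only&lt;br /&gt;
 CREATE TABLE citystateyear AS&lt;br /&gt;
 SELECT C.city, C.statecode, Y.year&lt;br /&gt;
 FROM (SELECT DISTINCT city, statecode FROM round) AS C&lt;br /&gt;
 CROSS JOIN generate_series(1980, 2018) AS Y(year);&lt;br /&gt;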
&lt;br /&gt;
For states, follow the same general process, but group by state rather than by city and state. &lt;br /&gt;
&lt;br /&gt;
If this explanation is not enough (it is not meant to be in depth), go to the location given above and read the actual code. With the description I have given, you should be able to piece together what each query does.&lt;br /&gt;
&lt;br /&gt;
==Master Tables==&lt;br /&gt;
Throughout the creation of the database, there are inevitably some tables that are vital to create a solid foundation. The following tables are the master tables with a quick explanation:&lt;br /&gt;
* '''Companybasecore'''- The base table for portcos. This is data that was drawn directly from SDC and was not changed other than for cleaning purposes. Count: 48001&lt;br /&gt;
* '''BranchOfficeCore'''- The base table for branch offices. This is data drawn directly from SDC. Here only branch offices with distinct firm names are included. Count: 10032&lt;br /&gt;
* '''FirmBaseCore'''- The base table for firms. This is also data taken directly from SDC and was not changed other than for cleaning purposes. Count: 15437&lt;br /&gt;
* '''FundBaseCore'''- The base table for funds. This is also data taken directly from SDC and was not changed other than for cleaning purposes. Count: 28833&lt;br /&gt;
* '''IPOCleanNoDups''' - This is the clean table of IPOs after being run through the matcher against portcos. It was cleaned manually and had duplicates removed. Count: 2136&lt;br /&gt;
* '''IPONoDups'''- This is the table before the cleaning process of matching to portcos. There could be problems with this table as we used an aggregate function here. Be careful using this table. Count: 11149&lt;br /&gt;
* '''MACleanNoDups'''- This is the clean table of MAs after being run through the matcher against portcos. It was cleaned manually and had duplicates removed. Count: 7171&lt;br /&gt;
* '''MANoDups'''- This is the table before the cleaning process of matching to portcos. There could be problems with this table as we used an aggregate function here as well. Be careful using this table. Count: 119374&lt;br /&gt;
* '''Round'''- This is the master round table. It has SEL flags attached to it and has the most round info. RoundBaseClean is also a decent table but has less information. This table is your best bet for round information. Count: 151323&lt;br /&gt;
* '''RoundLineJoinerLeanFFClean'''- This is the master round table for joining purposes. It was cleaned and used for widespread joining purposes. Count: 163157&lt;br /&gt;
* '''CoPeople'''- This is the base table for PortCo people information. It was pulled directly from SDC. Count: 194359&lt;br /&gt;
* '''FirmBoGeo'''- This is the base table for firm/branch office geocoding. This table was cleaned and contains lat/long readings for firms and branch offices where the information was available. Count: 15437&lt;br /&gt;
* '''PortCoGeo'''- This is the base table for portco geocoding. Table was cleaned and contains lat/long reading for portcos where the Google API returned a valid reading. Count: 48001&lt;br /&gt;
* '''FirmPerf'''- This is a wide-reaching table about the performance of firms. It was mainly used later in the project but is extremely useful. Count: 8336&lt;br /&gt;
* '''FundPeople'''- This is the base table for fund people information. It was pulled directly from SDC. Count: 328994&lt;br /&gt;
* '''PortCoExitUpdated'''- This is the master exit table for portcos. The difference between this and PortCoExit is that Updated has two columns marking MAs and IPOs while the other has one column MAvsIPO. Use whichever one is more convenient. Count: 48001&lt;br /&gt;
* '''PortCoMaster'''- This table is great. There's a ton of information on PortCos including SEL flags, round amounts, and industry classifications. Count: 48001&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VentureXpert_Data&amp;diff=48670</id>
		<title>VentureXpert Data</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VentureXpert_Data&amp;diff=48670"/>
		<updated>2025-01-16T16:13:18Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|Has project output=Data,How-to&lt;br /&gt;
|Has sponsor=McNair Center&lt;br /&gt;
|Has title=VentureXpert Data&lt;br /&gt;
|Has owner=Augi Liebster,&lt;br /&gt;
|Has start date=June 20, 2018&lt;br /&gt;
|Has project status=Active&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
The successors to this project include:&lt;br /&gt;
*[[VCDB24]], which is the most recent iteration.&lt;br /&gt;
*[[VCDB23]]&lt;br /&gt;
*[[VCDB20Q3]]&lt;br /&gt;
*[[VCDB20H1]]&lt;br /&gt;
*[[Vcdb4]]&lt;br /&gt;
&lt;br /&gt;
==Relevant Former Projects==&lt;br /&gt;
#[[Venture Capital (Data)]]&lt;br /&gt;
#[[Retrieving US VC Data From SDC]]&lt;br /&gt;
#[[VC Database Rebuild]]&lt;br /&gt;
&lt;br /&gt;
==Location==&lt;br /&gt;
My scripts for SDC pulls are located in the E drive in the location:&lt;br /&gt;
 E:\McNair\Projects\VentureXpertDatabase\ScriptsForSDCExtract&lt;br /&gt;
&lt;br /&gt;
My successfully pulled and normalized files are stored in the location:&lt;br /&gt;
 E:\McNair\Projects\VentureXpertDatabase\ExtractedDataQ2&lt;br /&gt;
&lt;br /&gt;
My scripts for loading tables and data are in:&lt;br /&gt;
 E:\McNair\Projects\VentureXpertDatabase\vcdb3\LoadingScripts&lt;br /&gt;
&lt;br /&gt;
There are a variety of SQL files in there with self-explanatory names. The file that has all of the loading scripts is called LoadingScriptsV1. The folder vcdb2 is there for reference, to show what earlier contributors had done. ExtractedData is there because I pulled data before July 1st, and Ed asked me to repull the data.&lt;br /&gt;
&lt;br /&gt;
==Goal==&lt;br /&gt;
I will be looking to redesign the VentureXpert Database in a way that is more intuitively built than the previous one. I will also update the database with current data.&lt;br /&gt;
&lt;br /&gt;
==Initial Stages==&lt;br /&gt;
The first step of the project was to figure out what primary keys to use for each major table that I create. I looked at the primary keys used in the creation of the [[VC Database Rebuild]] and found primary keys that are decent. I have updated them and list them below:&lt;br /&gt;
&lt;br /&gt;
#CompanyBaseCore- coname, statecode, datefirstinv&lt;br /&gt;
#IPOCore- issuer, issuedate, statecode&lt;br /&gt;
#MACore- target name, target state code, announceddate&lt;br /&gt;
#Geo - city, statecode, coname, datefirst, year&lt;br /&gt;
#DeadDate - coname, statecode, datefirst, rounddate (tentative, could still change)&lt;br /&gt;
#RoundCore- coname, statecode, datefirst, rounddate&lt;br /&gt;
#FirmBaseCore - firmname&lt;br /&gt;
#FundBaseCore - fund name (firstinvdate doesn't work because not every row has an entry)&lt;br /&gt;
&lt;br /&gt;
These are my initial listings and I will come back to update them if needed. &lt;br /&gt;
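&lt;br /&gt;
A candidate key can be checked once its table is loaded. The following is a minimal sketch, assuming a table named companybasecore with the columns listed above; it returns every key value that appears more than once, so an empty result means the key is unique.&lt;br /&gt;
&lt;br /&gt;
 --Illustrative check; table and column names are assumed from the list above&lt;br /&gt;
 SELECT coname, statecode, datefirstinv, COUNT(*) AS n&lt;br /&gt;
 FROM companybasecore&lt;br /&gt;
 GROUP BY coname, statecode, datefirstinv&lt;br /&gt;
 HAVING COUNT(*) &amp;gt; 1;&lt;br /&gt;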
&lt;br /&gt;
The second part of the initial stage has been to pull data from the SDC Platinum platform. I did it in July to ensure that I had two full quarters of data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==SDC Pull==&lt;br /&gt;
When pulling data from SDC, it is a good idea to look for previously made rpt files that have the names of the pulls you will need to do. They have already been created and will save you a lot of work. The rpt files that I used are in the folder VentureXpertDB/ScriptsForSDCExtract. The files come in pairs, with one saved as an ssh file and one as an rpt file. To bring the dates up to date, go into the ssh file of the pair and change the date of last investment. When you open SDC, you will be given a variety of choices for which database to pull from. For each type of file, choose the following:&lt;br /&gt;
&lt;br /&gt;
#VentureXpert - PortCo, PortCoLong, USVC, Firms, BranchOffices, Funds, Rounds, VCFirmLong&lt;br /&gt;
#Mergers &amp;amp; Acquisitions - MAs&lt;br /&gt;
#Global New Issues Databases - IPOs&lt;br /&gt;
&lt;br /&gt;
Help on pulling data from SDC is on the [[SDC Platinum (Wiki)]] page. &lt;br /&gt;
&lt;br /&gt;
===VCFund Pull Problem===&lt;br /&gt;
When pulling VCFund1980-Present, I encountered two problems. First, SDC's built-in filters are not able to restrict the pull to US-only funds. Second, there are multiple rpt files that specify different variables for the fund pull. I pulled from both to be safe, and I have both saved in the ExtractedData folder. The [[VC Database Rebuild]] page has a section on the fund pull where Ed specifies which rpt file he used to pull data from SDC, and when I spoke with Ed, he told me to use the VCFund1980-present.rpt file to extract the data. I also had various problems extracting the data, including the SDC program freezing and Out of Memory errors. Check the [[SDC Platinum (Wiki)]] page to fix these issues.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Loading Tables==&lt;br /&gt;
When I describe errors I encountered, I will not describe them using line numbers, because as soon as any data is added the line numbers become useless. Instead, I recommend that you copy the normalized file you are working with into an Excel file and use the filter feature. This way you can find the line in your specific file that is causing errors and fix it in the file itself. The line numbers that PuTTY errors display are often wrong, so I relied on Excel to find errors fastest. If my instructions are not enough for you to find the error, my advice is to take key words from the line that PuTTY says is causing errors and filter for them in Excel.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE roundbase;&lt;br /&gt;
 CREATE TABLE roundbase (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   rounddate date,&lt;br /&gt;
   updateddate date,&lt;br /&gt;
   foundingdate date,&lt;br /&gt;
   datelastinv date,&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   investedk real,&lt;br /&gt;
   city varchar(100),&lt;br /&gt;
   description varchar(5000),&lt;br /&gt;
   msa varchar(100),&lt;br /&gt;
   msacode varchar(10),&lt;br /&gt;
   nationcode varchar(10),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   addr1 varchar(100),&lt;br /&gt;
   addr2 varchar(100),&lt;br /&gt;
   indclass varchar(100),&lt;br /&gt;
   indsubgroup3 varchar(100),&lt;br /&gt;
   indminor varchar(100),&lt;br /&gt;
   url varchar(5000),&lt;br /&gt;
   zip varchar(10),&lt;br /&gt;
   stage1 varchar(100),&lt;br /&gt;
   stage3 varchar(100),&lt;br /&gt;
   rndamtdisck real,&lt;br /&gt;
   rndamtestk real,&lt;br /&gt;
   roundnum integer,&lt;br /&gt;
   numinvestors integer&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY roundbase FROM 'USVC1980-2018q2-Good.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --151549&lt;br /&gt;
&lt;br /&gt;
The only error I encountered here was with Cardtronic Technology Inc., where a mixture of quotation marks caused errors in loading. Find this using the Excel trick and remove it manually.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ipos;&lt;br /&gt;
 CREATE TABLE ipos (&lt;br /&gt;
   issuedate date,&lt;br /&gt;
   issuer varchar(255),&lt;br /&gt;
   statecode varchar(10), &lt;br /&gt;
   principalamt money, --million&lt;br /&gt;
   proceedsamt money, --sum of all markets in million&lt;br /&gt;
   naiccode varchar(255), --primary NAIC code&lt;br /&gt;
   zipcode varchar(10),&lt;br /&gt;
   status varchar (20),&lt;br /&gt;
   foundeddate date&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY ipos FROM 'IPO1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --12107&lt;br /&gt;
&lt;br /&gt;
I encountered no errors while loading this data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE branchoffices;&lt;br /&gt;
 CREATE TABLE branchoffices (&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   bocity varchar(100),&lt;br /&gt;
   bostate varchar(2),&lt;br /&gt;
   bocountrycode varchar(2),&lt;br /&gt;
   bonation varchar(100),&lt;br /&gt;
   bozip varchar(10),&lt;br /&gt;
   boaddr1 varchar(100),&lt;br /&gt;
   boaddr2 varchar(100)&lt;br /&gt;
 &lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY branchoffices FROM 'USVCFirmBranchOffices1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --10353&lt;br /&gt;
&lt;br /&gt;
I encountered no errors while loading this data.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE roundline;&lt;br /&gt;
 CREATE TABLE roundline (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(2),&lt;br /&gt;
   datelastinv date,&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   rounddate date,&lt;br /&gt;
   disclosedamt real,&lt;br /&gt;
   fundname varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY roundline FROM 'USVCRound1980-2018q2-NoFoot-normal-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --403189&lt;br /&gt;
&lt;br /&gt;
I encountered no errors while loading this data.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundbase;&lt;br /&gt;
 CREATE TABLE fundbase (&lt;br /&gt;
   fundname varchar(255),&lt;br /&gt;
   closedate date, --mm-dd-yyyy&lt;br /&gt;
   lastinvdate date, --mm-dd-yyyy&lt;br /&gt;
   firstinvdate date, --mm-dd-yyyy&lt;br /&gt;
   numportcos integer,&lt;br /&gt;
   investedk real,&lt;br /&gt;
   city varchar(100),&lt;br /&gt;
   fundyear varchar(4), --yyyy&lt;br /&gt;
   zip varchar(10),&lt;br /&gt;
   statecode varchar(2),&lt;br /&gt;
   fundsizem real,&lt;br /&gt;
   fundstage varchar(100),&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   dateinfoupdate date,&lt;br /&gt;
   invtype varchar(100),&lt;br /&gt;
   msacode varchar(10),&lt;br /&gt;
   nationcode varchar(10),&lt;br /&gt;
   raisestatus varchar(100),&lt;br /&gt;
   seqnum integer,&lt;br /&gt;
   targetsizefund real,&lt;br /&gt;
   fundtype varchar(100)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY fundbase FROM 'VCFund1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --29397&lt;br /&gt;
&lt;br /&gt;
There is a Ukrainian fund that has stray quotation marks in its name. It is called something along the lines of &amp;quot;VAT &amp;quot;ZNVKIF &amp;quot;Skhidno-Evropeis'lyi investytsiynyi Fond&amp;quot;. If this does not help, you can filter in Excel using Kiev as the keyword in the city column and find the line where you are getting errors. Then manually remove the stray quotation marks in the actual text file. After that, the table should load correctly.&lt;br /&gt;
 DROP TABLE firmbase;&lt;br /&gt;
 CREATE TABLE firmbase(&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   foundingdate date, --mm-dd-yyyy&lt;br /&gt;
   datefirstinv date, --mm-dd-yyyy  &lt;br /&gt;
   datelastinv date, --mm-dd-yyyy&lt;br /&gt;
   addr1 varchar(100),&lt;br /&gt;
   addr2 varchar(100),&lt;br /&gt;
   location varchar(100),&lt;br /&gt;
   city varchar(100),&lt;br /&gt;
   zip varchar(10),&lt;br /&gt;
   areacode integer,&lt;br /&gt;
   county varchar(100),&lt;br /&gt;
   state varchar(2),&lt;br /&gt;
   nationcode varchar(10),&lt;br /&gt;
   nation varchar(100),&lt;br /&gt;
   worldregion varchar(100),&lt;br /&gt;
   numportcos integer,&lt;br /&gt;
   numrounds integer,&lt;br /&gt;
   investedk money,&lt;br /&gt;
   capitalundermgmt money,  &lt;br /&gt;
   invstatus varchar(100),&lt;br /&gt;
   msacode varchar(10),&lt;br /&gt;
   rolepref varchar(100),&lt;br /&gt;
   geogpref varchar(100),&lt;br /&gt;
   indpref varchar(100),&lt;br /&gt;
   stagepref varchar(100),&lt;br /&gt;
   type varchar(100)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 \COPY firmbase FROM 'USVCFirms1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --15899&lt;br /&gt;
&lt;br /&gt;
The normalization for this file was wrong when I first tried to load the data. To fix this, go to the file where you have removed the footer and find the column header titled Firm Capital under Mgmt{0Mil}. Delete the {0Mil} and renormalize the file. A good way to check the result is to paste the normalized file into an Excel sheet and see whether the entries line up with their column headers. &lt;br /&gt;
The second error was with the Kerala Ventures firm. Its address contains the word l&amp;quot;opera. This quotation mark causes an error, so find the line using Excel and remove the quotation mark manually.&lt;br /&gt;
The third error is in an area code, where 1-8 is written. The hyphen causes an error because areacode is an integer column. Interestingly, the line number given by PuTTY was correct in this case, so I found it in my text file and deleted it manually.&lt;br /&gt;
These were the only errors I encountered while loading this table.&lt;br /&gt;
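&lt;br /&gt;
A check for values like the bad area code can be sketched in Python as well. This is an illustrative helper, not the method used above: it lists the lines whose value in a given column is neither blank nor a plain integer, which is what an integer column such as areacode requires. The function name and the header names in the sample are assumptions; the real file's headers may differ.&lt;br /&gt;

```python
import re

INTEGER = re.compile(r"[+-]?[0-9]+")


def non_integer_rows(text, column, delimiter="\t"):
    """Return (line_number, value) pairs where `column` is neither blank nor an integer."""
    lines = text.splitlines()
    header = lines[0].split(delimiter)
    index = header.index(column)
    bad = []
    # Data starts on line 2 because line 1 is the header.
    for number, line in enumerate(lines[1:], start=2):
        fields = line.split(delimiter)
        value = fields[index].strip() if len(fields) > index else ""
        if value and INTEGER.fullmatch(value) is None:
            bad.append((number, value))
    return bad


# Invented sample data: the second firm has a malformed area code.
firm_sample = (
    "firmname\tareacode\tstate\n"
    "Alpha Partners\t713\tTX\n"
    "Beta Capital\t1-8\tCA\n"
    "Gamma Ventures\t\tNY\n"
)
print(non_integer_rows(firm_sample, "areacode"))
```

Blank values are skipped because the load commands above treat an empty field as NULL.&lt;br /&gt;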
&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE mas;&lt;br /&gt;
 CREATE TABLE mas (&lt;br /&gt;
   announceddate date,&lt;br /&gt;
   effectivedate date,&lt;br /&gt;
   targetname varchar(255),&lt;br /&gt;
   targetstate varchar(100),&lt;br /&gt;
   acquirorname varchar(255),&lt;br /&gt;
   acquirorstate varchar(100),&lt;br /&gt;
   transactionamt money,&lt;br /&gt;
   enterpriseval varchar(255),&lt;br /&gt;
   acquirorstatus varchar(150)&lt;br /&gt;
 );&lt;br /&gt;
 \COPY mas FROM 'MAUSTargetComp100pc1985-July2018-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --119432&lt;br /&gt;
&lt;br /&gt;
I encountered no problems loading in this data.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE longdescription;&lt;br /&gt;
 CREATE TABLE longdescription(&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   fundingdate date, --date co received first inv&lt;br /&gt;
   codescription varchar(10000) --long description&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY longdescription FROM 'PortCoLongDesc-Ready-normal-fixed.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --48037&lt;br /&gt;
&lt;br /&gt;
I encountered no problems loading this data.&lt;br /&gt;
&lt;br /&gt;
==Cleaning Companybase, Fundbase, Firmbase, and BranchOffice==&lt;br /&gt;
===Cleaning Company===&lt;br /&gt;
The primary key for port cos will be coname, datefirstinv, and statecode. Before checking whether this is a valid primary key, remove the undisclosed companies. I will explain the second part of the query concerning New York Digital Health later. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE companybasecore;&lt;br /&gt;
 CREATE TABLE companybasecore AS&lt;br /&gt;
 SELECT * &lt;br /&gt;
 FROM Companybase WHERE nationcode = 'US' AND coname != 'Undisclosed Company' &lt;br /&gt;
 AND NOT (coname='New York Digital Health LLC' AND statecode='NY' AND datefirstinv='2015-08-13' AND updateddate='2015-10-20');&lt;br /&gt;
 --48001&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM (SELECT DISTINCT coname, statecode, datefirstinv FROM companybasecore) AS T;&lt;br /&gt;
 --48001&lt;br /&gt;
Since the row count of the table and the count of distinct primary keys are equal, the primary key is valid. In the initial cleaning of the table, I first filtered out only the undisclosed companies. That table had 48002 rows. I then ran the DISTINCT query above and found 48001 distinct combinations of coname, datefirstinv, and statecode. Thus there must be two rows that share a primary key. I found this key using the following query:&lt;br /&gt;
&lt;br /&gt;
 SELECT * FROM (SELECT coname, datefirstinv, statecode FROM companybase) as key GROUP BY coname, datefirstinv, statecode HAVING COUNT(key) &amp;gt; 1;&lt;br /&gt;
&lt;br /&gt;
The company named 'New York Digital Health LLC' came up as the company that is causing the problems. I queried to find the two rows that list this company name in companybase and chose to keep the row that had the earlier updated date. It is a good practice to avoid deleting rows from tables when possible, so I added the filter as a WHERE clause to exclude one of the New York Digital listings.&lt;br /&gt;
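&lt;br /&gt;
The key check can be tried on a toy table. The sketch below uses Python's built-in sqlite3 module with made-up rows (not the real companybase data) to show that the row count and the distinct-key count differ when a key is duplicated, and that a GROUP BY ... HAVING query finds the offending key.&lt;br /&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companybase "
    "(coname TEXT, statecode TEXT, datefirstinv TEXT, updateddate TEXT)"
)
# Invented rows: the second and third share the full three-column key.
rows = [
    ("Acme Inc", "TX", "2010-01-05", "2012-03-01"),
    ("Beta LLC", "NY", "2015-08-13", "2015-09-01"),
    ("Beta LLC", "NY", "2015-08-13", "2015-10-20"),
    ("Beta LLC", "CA", "2015-08-13", "2016-01-01"),
]
conn.executemany("INSERT INTO companybase VALUES (?, ?, ?, ?)", rows)

# Count of all rows versus count of distinct candidate keys.
total = conn.execute("SELECT COUNT(*) FROM companybase").fetchone()[0]
distinct_keys = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT DISTINCT coname, statecode, datefirstinv FROM companybase)"
).fetchone()[0]

# Keys that occur more than once, with their occurrence count.
duplicated = conn.execute(
    "SELECT coname, statecode, datefirstinv, COUNT(*) FROM companybase "
    "GROUP BY coname, statecode, datefirstinv HAVING COUNT(*) > 1"
).fetchall()

print(total, distinct_keys, duplicated)
```

When the two counts match, the candidate key is unique; when they differ, the last query names the keys to investigate.&lt;br /&gt;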
&lt;br /&gt;
===Cleaning Fundbase===&lt;br /&gt;
The primary key for funds will be only the fundname. First get rid of all of the undisclosed funds. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundbasenound;&lt;br /&gt;
 CREATE TABLE fundbasenound AS &lt;br /&gt;
 SELECT DISTINCT * FROM fundbase WHERE fundname NOT LIKE '%Undisclosed Fund%';&lt;br /&gt;
 --28886&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM (SELECT DISTINCT fundname FROM fundbasenound)a;&lt;br /&gt;
 --28833&lt;br /&gt;
&lt;br /&gt;
As you can see, fundbase still has rows that share fundnames. If you are wondering why the DISTINCT in the first query did not eliminate these, it is because this DISTINCT applies to the whole row not individual fundnames. Thus, only completely duplicate rows will be eliminated in the first query. I chose to keep the funds that have the earlier last investment date. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundups;&lt;br /&gt;
 CREATE TABLE fundups AS SELECT&lt;br /&gt;
 fundname, max(lastinvdate) AS lastinvdate FROM fundbasenound GROUP BY fundname HAVING COUNT(*)&amp;gt;1;&lt;br /&gt;
 --53&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundbasecore;&lt;br /&gt;
 CREATE TABLE fundbasecore AS&lt;br /&gt;
 SELECT A.* FROM fundbasenound AS A LEFT JOIN fundups AS B ON A.fundname=B.fundname AND A.lastinvdate=B.lastinvdate WHERE B.fundname IS NULL AND B.lastinvdate IS NULL;&lt;br /&gt;
 --28833&lt;br /&gt;
&lt;br /&gt;
Since the count of fundbasecore is the same as the number of distinct fund names, we know that the fundbasecore table is clean. The first query finds the fund names that occur more than once and records the latest last investment date for each. The second query joins this table back to fundbasenound and keeps only the rows of fundbasenound that have no matching row in fundups on fund name and date of last investment. This drops the row with the latest date, so the funds with the earlier date of last investment are kept.&lt;br /&gt;
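&lt;br /&gt;
To see which row the two queries keep, here is a small sketch using Python's sqlite3 module and invented fund rows. It mirrors the fundups and fundbasecore steps on a three-row table; the data and the simplified two-column schema are assumptions for illustration.&lt;br /&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fundbasenound (fundname TEXT, lastinvdate TEXT)")
conn.executemany(
    "INSERT INTO fundbasenound VALUES (?, ?)",
    [
        ("Fund A", "2010-06-30"),  # unique name: kept
        ("Fund B", "2008-01-15"),  # duplicate name, earlier date: kept
        ("Fund B", "2012-09-01"),  # duplicate name, latest date: dropped
    ],
)

# Step 1: for each duplicated name, record the latest last investment date.
conn.execute(
    "CREATE TABLE fundups AS "
    "SELECT fundname, MAX(lastinvdate) AS lastinvdate FROM fundbasenound "
    "GROUP BY fundname HAVING COUNT(*) > 1"
)

# Step 2: anti-join, keeping only rows with no match in fundups.
conn.execute(
    "CREATE TABLE fundbasecore AS "
    "SELECT A.* FROM fundbasenound AS A "
    "LEFT JOIN fundups AS B "
    "ON A.fundname = B.fundname AND A.lastinvdate = B.lastinvdate "
    "WHERE B.fundname IS NULL"
)

kept = conn.execute(
    "SELECT fundname, lastinvdate FROM fundbasecore ORDER BY fundname"
).fetchall()
print(kept)
```

Note that if a fund name appeared three or more times, this approach would remove only the row with the latest date, so the distinct-count check afterwards is still needed.&lt;br /&gt;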
&lt;br /&gt;
===Cleaning Firmbase===&lt;br /&gt;
The primary key for firms will be firm name. First I got rid of all undisclosed firms. I also filtered out two firms that have identical firm names and founding dates. I did this because I use founding dates to filter out duplicate firm names. If there are two rows that have the same firm name and founding date, they will not be filtered out by the third query below. Thus, I chose to filter those out completely.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmbasenound;&lt;br /&gt;
 CREATE TABLE firmbasenound AS &lt;br /&gt;
 SELECT DISTINCT * FROM firmbase WHERE firmname NOT LIKE '%Undisclosed Firm%' AND firmname NOT LIKE '%Amundi%' AND firmname NOT LIKE '%Schroder Adveq Management%';&lt;br /&gt;
 --15452&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT firmname FROM firmbasenound)a;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
Since these counts are not equal we will have to clean the table further. We will use the same method from before.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmdups;&lt;br /&gt;
 CREATE TABLE firmdups AS SELECT&lt;br /&gt;
 firmname, max(foundingdate) as foundingdate FROM firmbasenound GROUP BY firmname HAVING COUNT(*)&amp;gt;1;&lt;br /&gt;
 --15&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmbasecore;&lt;br /&gt;
 CREATE TABLE firmbasecore AS&lt;br /&gt;
 SELECT A.* FROM firmbasenound AS A LEFT JOIN firmdups AS B ON A.firmname=B.firmname AND A.foundingdate=B.foundingdate WHERE B.firmname IS NULL AND B.foundingdate IS NULL;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
Since the count of firmbasecore and the DISTINCT query are the same, the firm table is now clean.&lt;br /&gt;
&lt;br /&gt;
===Cleaning Branch Offices===&lt;br /&gt;
When cleaning the branch offices, I had to remove all duplicates in the table. This is because the table is so sparse that often the only data in a row is the firm name the branch is associated with. Thus, I couldn't filter based on dates as I had done previously for firms and funds. The primary key is firm name.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bonound;&lt;br /&gt;
 CREATE TABLE bonound AS&lt;br /&gt;
 SELECT *, CASE WHEN firmname LIKE '%Undisclosed Firm%' THEN 1::int ELSE 0::int END AS undisclosedflag&lt;br /&gt;
 FROM branchoffices;&lt;br /&gt;
 --10353&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT firmname FROM bonound)a;&lt;br /&gt;
 --10042&lt;br /&gt;
&lt;br /&gt;
Since these counts aren't the same, we will have to work a little more to clean the table. As stated above, I did this by excluding the firm names that were duplicated.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE branchofficecore;&lt;br /&gt;
 CREATE TABLE branchofficecore AS&lt;br /&gt;
 SELECT A.* FROM bonound AS A JOIN (&lt;br /&gt;
 		SELECT bonound.firmname, COUNT(*) FROM bonound GROUP BY firmname&lt;br /&gt;
 		HAVING COUNT(*) =1&lt;br /&gt;
 		) AS B&lt;br /&gt;
 ON A.firmname=B.firmname WHERE undisclosedflag=0;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM (SELECT DISTINCT firmname FROM branchofficecore)a;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
Since these counts are the same, we are good to go. The count is 10 lower because we completely removed 10 firmnames from the listing by throwing out the duplicates.&lt;br /&gt;
&lt;br /&gt;
==Instructions on Matching PortCos to Issuers and M&amp;amp;As From Ed==&lt;br /&gt;
===Company Standardizing===&lt;br /&gt;
&lt;br /&gt;
Get portco keys&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE portcokeys;&lt;br /&gt;
 CREATE TABLE portcokeys AS&lt;br /&gt;
 SELECT coname, statecode, datefirstinv&lt;br /&gt;
 FROM portcocore;&lt;br /&gt;
 --CHECK COUNT IS SAME AS portcocore OR THESE KEYS ARE NOT VALID AND FIX THAT FIRST&lt;br /&gt;
&lt;br /&gt;
Get distinct coname and put it in a file&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT DISTINCT coname FROM portcokeys) TO 'DistinctConame.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
&lt;br /&gt;
Match that to itself&lt;br /&gt;
 Move DistinctConame.txt to E:\McNair\Software\Scripts\Matcher\Input&lt;br /&gt;
 Open powershell and change directory to E:\McNair\Software\Scripts\Matcher&lt;br /&gt;
 Run the matcher in mode 2:&lt;br /&gt;
  perl Matcher.pl -file1=&amp;quot;DistinctConame.txt&amp;quot; -file2=&amp;quot;DistinctConame.txt&amp;quot; -mode=2&lt;br /&gt;
 Pick up the output file from E:\McNair\Software\Scripts\Matcher\Output (it is probably called DistinctConame.txt-DistinctConame.txt.matched) and move it to your Z drive directory&lt;br /&gt;
 &lt;br /&gt;
Load the matches into the database&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE PortcoStd;&lt;br /&gt;
 CREATE TABLE PortcoStd (&lt;br /&gt;
    conamestd  varchar(255),&lt;br /&gt;
    coname   varchar(255),&lt;br /&gt;
    norm  varchar(100),&lt;br /&gt;
    x1  varchar(255),&lt;br /&gt;
    x2  varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
 &lt;br /&gt;
 \COPY PortcoStd FROM 'DistinctConame.txt-DistinctConame.txt.matched' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --YOUR COUNT&lt;br /&gt;
 &lt;br /&gt;
Join the conamestd back to the portcokeys table to create your matching table&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE portcokeysstd;&lt;br /&gt;
 CREATE TABLE portcokeysstd AS&lt;br /&gt;
 SELECT B.conamestd, A.*&lt;br /&gt;
 FROM portcokeys AS A&lt;br /&gt;
 JOIN PortcoStd AS B ON A.coname=B.coname;&lt;br /&gt;
 --CHECK COUNT IS SAME AS portcokeys OR YOU LOST SOME NAMES OR INFLATED THE DATA&lt;br /&gt;
 &lt;br /&gt;
Put that in a file for matching (conamestd is in first column by construction)&lt;br /&gt;
&lt;br /&gt;
  \COPY portcokeysstd TO 'PortCoMatchInput.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
  --YOUR COUNT&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===MA Cleaning and Matching===&lt;br /&gt;
First remove all of the duplicates in the MA data. Do this by running aggregate queries on every column except for the primary key:&lt;br /&gt;
 DROP TABLE MANoDups;&lt;br /&gt;
 CREATE TABLE MANoDups AS&lt;br /&gt;
 SELECT targetname, targetstate, announceddate, min(effectivedate) AS effectivedate, MIN(acquirorname) as acquirorname, MIN(acquirorstate) as acquirorstate, MAX(transactionamt) as &lt;br /&gt;
 transactionamt, MAX(enterpriseval) as enterpriseval, MIN(acquirorstatus) as acquirorstatus&lt;br /&gt;
 FROM mas &lt;br /&gt;
 GROUP BY targetname, targetstate, announceddate ORDER BY targetname, targetstate, announceddate;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT targetname, targetstate, announceddate FROM manodups)a;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
Since these counts are equivalent, the data set is clean. Then get all the primary keys from the table and copy the distinct target names into a text file.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE makey;&lt;br /&gt;
 CREATE TABLE makey AS&lt;br /&gt;
 SELECT targetname, targetstate, announceddate&lt;br /&gt;
 FROM manodups;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT DISTINCT targetname FROM makey) TO 'DistinctTargetName.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV;&lt;br /&gt;
 --117212&lt;br /&gt;
&lt;br /&gt;
After running this list of distinct target names through the matcher, load the standardized MA list into the database.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MaStd;&lt;br /&gt;
 CREATE TABLE MaStd (&lt;br /&gt;
   targetnamestd varchar(255),&lt;br /&gt;
   targetname varchar(255),&lt;br /&gt;
   norm varchar(100),&lt;br /&gt;
   x1 varchar(255),&lt;br /&gt;
   x2 varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY mastd FROM 'DistinctTargetName.txt-DistinctTargetName.txt.matched' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --117212&lt;br /&gt;
&lt;br /&gt;
Then match the list of standardized names back to the makey table to get a table with standardized keys and primary keys. This will be your input for matching against port cos. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE makeysstd;&lt;br /&gt;
 CREATE TABLE makeysstd AS&lt;br /&gt;
 SELECT B.targetnamestd, A.*&lt;br /&gt;
 FROM makey AS A&lt;br /&gt;
 JOIN mastd AS B ON A.targetname=B.targetname;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
  \COPY makeysstd TO 'MAMatchInput.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
  --119374&lt;br /&gt;
&lt;br /&gt;
Use this text file to match against the PortCoMatchInput. Your job will be to determine whether the matches between the MAs and PortCos are true matches. The techniques that I used are described in the section below.&lt;br /&gt;
&lt;br /&gt;
===IPO Cleaning and Matching===&lt;br /&gt;
The process is the same for IPOs.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE iponodups;&lt;br /&gt;
 CREATE TABLE iponodups&lt;br /&gt;
 AS SELECT issuer, statecode, issuedate, MAX(principalamt) AS principalamt, MAX(proceedsamt) AS proceedsamt, MIN(naiccode) as naicode, MIN(zipcode) AS zipcode, MIN(status) AS status, &lt;br /&gt;
 MIN(foundeddate) AS foundeddate&lt;br /&gt;
 FROM ipos GROUP BY issuer, statecode, issuedate ORDER BY issuer, statecode, issuedate; &lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT issuer, statecode, issuedate FROM iponodups)a;&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ipokeys;&lt;br /&gt;
 CREATE TABLE ipokeys AS&lt;br /&gt;
 SELECT issuer, statecode, issuedate&lt;br /&gt;
 FROM iponodups;&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT DISTINCT issuer FROM ipokeys) TO 'IPODistinctIssuer.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --10803&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ipokeysstd;&lt;br /&gt;
 CREATE TABLE ipokeysstd (&lt;br /&gt;
    issuerstd varchar(255),&lt;br /&gt;
    issuer varchar(255),&lt;br /&gt;
    norm varchar(100),&lt;br /&gt;
    x1 varchar(255),&lt;br /&gt;
    x2 varchar(255)&lt;br /&gt;
   );&lt;br /&gt;
 &lt;br /&gt;
 \COPY ipokeysstd FROM 'IPODistinctIssuer.txt-IPODistinctIssuer.txt.matched' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --10803&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ipostd;&lt;br /&gt;
 CREATE TABLE ipostd AS&lt;br /&gt;
 SELECT B.issuerstd, A.*&lt;br /&gt;
 FROM ipokeys AS A&lt;br /&gt;
 JOIN ipokeysstd AS B ON A.issuer=B.issuer;&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 \COPY ipostd TO 'IPOMatchInput.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
As with MA, match this file against the PortCoMatchInput file without mode 2. Then manually check the matches using the techniques described below.&lt;br /&gt;
&lt;br /&gt;
I generally use MAX for amounts and MIN for dates. I also chose to use MIN on text strings.&lt;br /&gt;
&lt;br /&gt;
==Cleaning IPO and MA Data==&lt;br /&gt;
It is important to follow Ed's direction of cleaning the data with aggregate functions before putting the data into Excel. This saves you a lot of unnecessary manual checking. When ready, paste the data into an Excel file. In that file, I made three columns: one checking whether the state codes were equivalent, one checking whether the date of first investment was 3 years before the MA or IPO, and one checking whether both of these conditions were satisfied for each company. I did this using simple IF statements. The rest of the process is manual checking and filtering to decide whether matches are correct, so it is subjective and tedious. First, I went through the companies that did not have equivalent state codes. If the company was one that I knew, or the name was unique enough that I did not believe the same name would appear in another state, I marked the state codes as equivalent. I did the same for the date of first investment versus the MA/IPO date. Then I removed all duplicates that had the marking Warning Multiple Matches, and the data sheets were clean.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Process For Creating the PortCoExits Table==&lt;br /&gt;
===MA Process===&lt;br /&gt;
First we must load the clean, manually checked tables back into the database. &lt;br /&gt;
 DROP TABLE MAClean;&lt;br /&gt;
 CREATE TABLE MAClean (&lt;br /&gt;
  conamestd varchar(255),&lt;br /&gt;
  targetnamestd varchar(255),&lt;br /&gt;
  method varchar(100),&lt;br /&gt;
  x1 varchar(255),&lt;br /&gt;
  coname varchar(255),&lt;br /&gt;
  statecode varchar(10),&lt;br /&gt;
  datefirstinv date,&lt;br /&gt;
  x2 varchar(255),&lt;br /&gt;
  targetname varchar(255),&lt;br /&gt;
  targetstate varchar(10),&lt;br /&gt;
  announceddate date&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY MAClean FROM 'MAClean.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --7205&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT targetname, targetstate, announceddate FROM MAClean)a;&lt;br /&gt;
 --7188&lt;br /&gt;
&lt;br /&gt;
As you can see there are still duplicate primary keys in the table. To get rid of these I wrote a query that chooses primary keys that occur only once and matches them against MANoDups. That way you will have unique primary keys by construction.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MACleanNoDups;&lt;br /&gt;
 CREATE TABLE MACleanNoDups AS&lt;br /&gt;
 SELECT A.*, effectivedate, transactionamt, enterpriseval, acquirorstatus&lt;br /&gt;
 FROM MAClean AS A&lt;br /&gt;
 JOIN (&lt;br /&gt;
 	SELECT targetname, targetstate, announceddate, COUNT(*) FROM MAClean&lt;br /&gt;
 	GROUP BY targetname, targetstate, announceddate HAVING COUNT(*)=1&lt;br /&gt;
 	) AS B&lt;br /&gt;
 ON A.targetname=B.targetname AND A.targetstate=B.targetstate AND A.announceddate=B.announceddate&lt;br /&gt;
 LEFT JOIN MANoDups AS C ON A.targetname=C.targetname AND A.targetstate=C.targetstate AND A.announceddate=C.announceddate;&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT coname, statecode, datefirstinv FROM MACleanNoDups)a;&lt;br /&gt;
 --7171&lt;br /&gt;
&lt;br /&gt;
Thus the portco primary key is unique in the table. We will use this later. &lt;br /&gt;
Now do the same for the IPOs.&lt;br /&gt;
&lt;br /&gt;
===IPO Process===&lt;br /&gt;
 DROP TABLE IPOClean;&lt;br /&gt;
 CREATE TABLE IPOClean (&lt;br /&gt;
  conamestd varchar(255),&lt;br /&gt;
  issuernamestd varchar(255),&lt;br /&gt;
  method varchar(100),&lt;br /&gt;
  x1 varchar(255),&lt;br /&gt;
  coname varchar(255),&lt;br /&gt;
  statecode varchar(10),&lt;br /&gt;
  datefirstinv date,&lt;br /&gt;
  x2 varchar(255),&lt;br /&gt;
  issuername varchar(255),&lt;br /&gt;
  issuerstate varchar(10),&lt;br /&gt;
  issuedate date&lt;br /&gt;
 );&lt;br /&gt;
 &lt;br /&gt;
 \COPY IPOClean FROM 'IPOClean.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --2146&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT issuername, issuerstate, issuedate FROM IPOClean)a;&lt;br /&gt;
 --2141&lt;br /&gt;
&lt;br /&gt;
As with the MA process, there were duplicates in the clean IPO table. Get rid of these using the same process as with MAs. Only choose the primary keys that occur once and join these to the IPONoDups table.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE IPOCleanNoDups;&lt;br /&gt;
 CREATE TABLE IPOCleanNoDups AS&lt;br /&gt;
 SELECT A.*, principalamt, proceedsamt, naicode as naics, zipcode, status, foundeddate&lt;br /&gt;
 FROM IPOClean AS A&lt;br /&gt;
 JOIN (&lt;br /&gt;
 	SELECT issuername, issuerstate, issuedate, COUNT(*) FROM IPOClean&lt;br /&gt;
 	GROUP BY issuername, issuerstate, issuedate HAVING COUNT(*)=1&lt;br /&gt;
 	) AS B&lt;br /&gt;
 ON A.issuername=B.issuername AND A.issuerstate=B.issuerstate AND A.issuedate=B.issuedate&lt;br /&gt;
 LEFT JOIN IPONoDups AS C ON A.issuername=C.issuer AND A.issuerstate=C.statecode AND A.issuedate=C.issuedate;&lt;br /&gt;
 --2136&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT coname, statecode, datefirstinv FROM IPOCleanNoDups)a;&lt;br /&gt;
 --2136&lt;br /&gt;
&lt;br /&gt;
Now the duplicates are out of the MAClean and IPOClean data and we can start to construct the ExitKeysClean table.&lt;br /&gt;
&lt;br /&gt;
==Creating ExitKeysClean==&lt;br /&gt;
&lt;br /&gt;
First I looked for the PortCos that were in both the MAs and the IPOs. I did this using:&lt;br /&gt;
 DROP TABLE IPOMAForReview;&lt;br /&gt;
 CREATE TABLE IPOMAForReview AS&lt;br /&gt;
 SELECT A.*, B.targetname, B.targetstate, B.announceddate&lt;br /&gt;
 FROM IPOCleanNoDups AS A&lt;br /&gt;
 JOIN MACleanNoDups AS B ON A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv;&lt;br /&gt;
 --92&lt;br /&gt;
&lt;br /&gt;
I then pulled out the IPOs that were only IPOs and the MAs that were only MAs. I also added a column that indicates whether a company underwent an IPO or an MA.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE IPONoConflict;&lt;br /&gt;
 CREATE TABLE IPONoConflict AS&lt;br /&gt;
 SELECT A.*, 1::int as IPOvsMA&lt;br /&gt;
 FROM IPOCleanNoDups AS A &lt;br /&gt;
 LEFT JOIN MACleanNoDups AS B &lt;br /&gt;
 ON A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv &lt;br /&gt;
 WHERE B.statecode IS NULL AND B.coname IS NULL AND B.datefirstinv IS NULL;&lt;br /&gt;
 --2044&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MANoConflict;&lt;br /&gt;
 CREATE TABLE MANoConflict AS&lt;br /&gt;
 SELECT A.*, 0::int as IPOvsMA&lt;br /&gt;
 FROM MACleanNoDups AS A&lt;br /&gt;
 LEFT JOIN IPOCleanNoDups AS B &lt;br /&gt;
 ON A.coname=B.Coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv&lt;br /&gt;
 WHERE B.statecode IS NULL AND B.coname IS NULL AND B.datefirstinv IS NULL;&lt;br /&gt;
 --7079&lt;br /&gt;
&lt;br /&gt;
Since 2136-92=2044 and 7171-92=7079, we know that the duplicate companies were extracted successfully.&lt;br /&gt;
&lt;br /&gt;
I then wrote queries that check whether the IPO issue date or the MA announced date was earlier, and used the result to decide whether to treat the company as having undergone an MA or an IPO. The choice is recorded in the column IPOvsMA. A 0 in the column represents an MA being chosen and a 1 represents an IPO being chosen.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
From the IPOMAForReview table I extracted the MAs and IPOs using this date comparison and set the IPOvsMA flag accordingly. When the two dates are equal, the MA is chosen:&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MASelected;&lt;br /&gt;
 CREATE TABLE MASelected AS&lt;br /&gt;
 SELECT coname, statecode, datefirstinv, &lt;br /&gt;
 targetname, targetstate, announceddate,&lt;br /&gt;
 0::int as IPOvsMA&lt;br /&gt;
 FROM IPOMAForReview &lt;br /&gt;
 WHERE issuedate &amp;gt;= announceddate;&lt;br /&gt;
 --25&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE IPOSelected;&lt;br /&gt;
 CREATE TABLE IPOSelected AS&lt;br /&gt;
 SELECT coname, statecode, datefirstinv, &lt;br /&gt;
 issuername, issuerstate, issuedate,&lt;br /&gt;
 1::int as IPOvsMA&lt;br /&gt;
 FROM IPOMAForReview &lt;br /&gt;
 WHERE issuedate &amp;lt; announceddate;&lt;br /&gt;
 --67&lt;br /&gt;
&lt;br /&gt;
I then made the ExitKeys table using the portco primary key and the IPOvsMA indicator column.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ExitKeys;&lt;br /&gt;
 CREATE TABLE ExitKeys AS&lt;br /&gt;
 SELECT coname, statecode, datefirstinv, ipovsma FROM IPONoConflict&lt;br /&gt;
 UNION SELECT coname, statecode, datefirstinv, ipovsma FROM IPOSelected&lt;br /&gt;
 UNION SELECT coname, statecode, datefirstinv, ipovsma FROM MANoConflict&lt;br /&gt;
 UNION SELECT coname, statecode, datefirstinv, ipovsma FROM MASelected;&lt;br /&gt;
 --9215&lt;br /&gt;
&lt;br /&gt;
==Create the PortCoExit And PortCoAliveDead Tables==&lt;br /&gt;
From consulting with Ed and the VC Database Rebuild wiki, I decided to make the PortCoExit table with ipovsma, exit, exitvaluem, exitdate, and exityear columns. I use the ipovsma column to decide which data to add, so it is very important that you have constructed this column.&lt;br /&gt;
 DROP TABLE PortCoExit;&lt;br /&gt;
 CREATE TABLE PortCoExit AS&lt;br /&gt;
 SELECT A.coname, A.statecode, A.datefirstinv, A.datelastinv, A.city, B.ipovsma,&lt;br /&gt;
 CASE WHEN B.ipovsma IS NOT NULL THEN 1::int ELSE 0::int END AS Exit,&lt;br /&gt;
 CASE WHEN B.ipovsma=1 THEN C.proceedsamt::numeric WHEN ipovsma=0 THEN D.transactionamt::numeric ELSE NULL::numeric END AS exitvaluem,&lt;br /&gt;
 CASE WHEN B.ipovsma=1 THEN C.issuedate WHEN ipovsma=0 THEN D.announceddate ELSE NULL::date END AS exitdate,&lt;br /&gt;
 CASE WHEN B.ipovsma=1 THEN extract(year from C.issuedate) WHEN ipovsma=0 THEN extract(year from D.announceddate) ELSE NULL::int END AS exityear&lt;br /&gt;
 FROM companybasecore AS A&lt;br /&gt;
 LEFT JOIN ExitKeys AS B ON A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv&lt;br /&gt;
 LEFT JOIN IPOCleanNoDups AS C ON A.coname=C.coname AND A.statecode=C.statecode AND A.datefirstinv=C.datefirstinv&lt;br /&gt;
 LEFT JOIN MACleanNoDups AS D ON A.coname=D.coname AND A.statecode=D.statecode AND A.datefirstinv=D.datefirstinv;&lt;br /&gt;
 --48001&lt;br /&gt;
&lt;br /&gt;
I then used this table to build one that records whether a company is dead or alive. A company that has undergone an IPO or MA is marked as dead as of its exit date. Otherwise, if the company's date of last investment was more than 5 years before July 1, 2018, I marked the company as dead 5 years after that date.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE PortCoAliveDead;&lt;br /&gt;
 CREATE TABLE PortCoAliveDead AS&lt;br /&gt;
 SELECT *, &lt;br /&gt;
 datefirstinv as alivedate, extract(year from datefirstinv) as aliveyear,&lt;br /&gt;
 CASE WHEN exitdate IS NOT NULL then exitdate &lt;br /&gt;
 	WHEN exitdate IS NULL AND (datelastinv + INTERVAL '5 year') &amp;lt; '7/1/2018' THEN (datelastinv + INTERVAL '5 year') &lt;br /&gt;
 	ELSE NULL::date END AS deaddate,&lt;br /&gt;
 CASE WHEN exitdate IS NOT NULL then exityear &lt;br /&gt;
 	WHEN exitdate IS NULL AND (datelastinv + INTERVAL '5 year') &amp;lt; '7/1/2018' THEN extract(year from (datelastinv + INTERVAL '5 year')) &lt;br /&gt;
 	ELSE NULL::int END AS deadyear&lt;br /&gt;
 FROM PortCoExit;&lt;br /&gt;
 --48001&lt;br /&gt;
&lt;br /&gt;
==GeoCoding Companies, Firms, and Branch Offices==&lt;br /&gt;
The [[Geocode.py]] page explains how to use the geocoding script. You will have to tweak the script for each run, as each of these tables has a different primary key. It is vital that you include the primary key columns in both the input file and the output file of the geocoding script. Without them, you will not be able to join the latitudes and longitudes back to the firm, branch office, or company base tables.&lt;br /&gt;
&lt;br /&gt;
Geocoding costs money since we are using the Google Maps API. The process doesn't cost much, but in order to save money I tried to salvage as much of the preexisting geocode information as I could find.&lt;br /&gt;
===Companies===&lt;br /&gt;
I found the table of old companies with latitudes and longitudes in vcdb2 and loaded these into vcdb3.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE oldgeocords;&lt;br /&gt;
 CREATE TABLE oldgeocords (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   ivestedk real,&lt;br /&gt;
   city varchar(255),&lt;br /&gt;
   addr1 varchar(255),&lt;br /&gt;
   addr2 varchar(100),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY oldgeocords FROM 'companybasegeomaster.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --44740&lt;br /&gt;
&lt;br /&gt;
The API occasionally gives erroneous latitude and longitude readings. To keep only the good ones, I took a bounding box that encompasses the mainland US (longitude between -125 and -66, latitude between 24 and 50) and created an exclude flag marking companies that fall outside this box or have missing coordinates. I then created flags to include companies in Puerto Rico, Hawaii, and Alaska, which lie outside the box. Companies in these places often had the same wrong reading of latitude 44.93, longitude 7.54, so the flags also require that the coordinates do not match these values.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geoallcoords;&lt;br /&gt;
 CREATE TABLE geoallcoords AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM oldgeocords;&lt;br /&gt;
 --44740&lt;br /&gt;
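&lt;br /&gt;
As a quick sanity check of the bounding box logic, the same flag can be computed on a few hand-made rows. This is only a sketch: the sample rows and the label column are made up for illustration, and NOT BETWEEN is used as an equivalent way of writing the comparisons above. The first row should get an excludeflag of 0 and the other two should get 1.&lt;br /&gt;
&lt;br /&gt;
 --Hypothetical sample rows, not taken from the data&lt;br /&gt;
 SELECT label, latitude, longitude, CASE&lt;br /&gt;
 WHEN longitude NOT BETWEEN -125 AND -66 OR latitude NOT BETWEEN 24 AND 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE&lt;br /&gt;
 0::int END AS excludeflag&lt;br /&gt;
 FROM (VALUES ('inside box', 29.76, -95.37), ('known bad reading', 44.9331, 7.54012), ('missing', NULL, NULL)) AS T(label, latitude, longitude);&lt;br /&gt;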
&lt;br /&gt;
 DROP TABLE geoallcoords1;&lt;br /&gt;
 CREATE TABLE geoallcoords1 AS SELECT&lt;br /&gt;
 *, CASE WHEN statecode='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN statecode='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN statecode='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM geoallcoords;&lt;br /&gt;
 --44740&lt;br /&gt;
&lt;br /&gt;
I then kept only companies located in the mainland US, Hawaii, Alaska, or Puerto Rico.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodgeoold;&lt;br /&gt;
 CREATE TABLE goodgeoold AS SELECT&lt;br /&gt;
 A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM companybasecore AS A LEFT JOIN geoallcoords1 AS B ON&lt;br /&gt;
 A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --38498&lt;br /&gt;
&lt;br /&gt;
I then found the remaining companies that needed to be geocoded. Only companies that have addresses listed are able to be accurately geocoded. If we attempt to geocode based on city, the location returned will simply be the center of the city. Thus, I chose the companies that we did not already have listings for and had a valid address.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE remaininggeo;&lt;br /&gt;
 CREATE TABLE remaininggeo AS SELECT A.coname, A.statecode, A.datefirstinv, A.addr1, A.addr2, A.city, A.zip FROM companybasecore AS A LEFT JOIN goodgeoold AS B ON A.coname=B.coname &lt;br /&gt;
 AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv&lt;br /&gt;
 WHERE B.coname IS NULL AND A.addr1 IS NOT NULL;&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
 \COPY remaininggeo TO 'RemainingGeo.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
I copied this table into Excel to concatenate the address, city, state, and zipcode columns into one column. This can and should be done in SQL instead, but I was not aware of that at the time. I then ran remaininggeo through the Geocode script, with the columns coname, statecode, datefirstinv, and address in the input file.&lt;br /&gt;
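&lt;br /&gt;
A minimal sketch of how the concatenation could be done in SQL, assuming PostgreSQL 9.1 or later for concat_ws, which skips NULL columns such as a missing addr2. The output file name RemainingGeoAddress.txt and the comma separator are assumptions for illustration, not what was actually used.&lt;br /&gt;
&lt;br /&gt;
 --Hypothetical output file name; builds one address column from the parts&lt;br /&gt;
 \COPY (SELECT coname, statecode, datefirstinv, concat_ws(', ', addr1, addr2, city, statecode, zip) AS address FROM remaininggeo) TO 'RemainingGeoAddress.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;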
&lt;br /&gt;
 DROP TABLE remaining;&lt;br /&gt;
 CREATE TABLE remaining (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY remaining FROM 'RemainingLatLong.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
I then ran the same geographical checks on the newly geocoded companies and found all of the good geocodes. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geoallcoords2;&lt;br /&gt;
 CREATE TABLE geoallcoords2 AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM remaining;&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geoallcoords3;&lt;br /&gt;
 CREATE TABLE geoallcoords3 AS&lt;br /&gt;
 SELECT *, CASE WHEN statecode='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN statecode='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN statecode='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM geoallcoords2;&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodgeonew;&lt;br /&gt;
 CREATE TABLE goodgeonew AS&lt;br /&gt;
 SELECT A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM companybasecore AS A LEFT JOIN geoallcoords3 AS B ON&lt;br /&gt;
 A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --5913&lt;br /&gt;
&lt;br /&gt;
I then combined the old and new geocodes and matched them back to the company base table to get a geo table for companies.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geocodesportco;&lt;br /&gt;
 CREATE TABLE geocodesportco AS SELECT&lt;br /&gt;
 A.* FROM goodgeonew AS A&lt;br /&gt;
 UNION&lt;br /&gt;
 SELECT B.* FROM goodgeoold AS B;&lt;br /&gt;
 --44411&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE portcogeo;&lt;br /&gt;
 CREATE TABLE portcogeo AS SELECT&lt;br /&gt;
 A.*, B.latitude, B.longitude FROM companybasecore AS A LEFT JOIN Geocodesportco AS B ON A.coname=B.coname AND A.datefirstinv=B.datefirstinv AND A.statecode=B.statecode;&lt;br /&gt;
 --48001&lt;br /&gt;
&lt;br /&gt;
===Firms===&lt;br /&gt;
This process is largely the same as for companies. I found old firms that had already been geocoded and checked for accuracy.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE oldfirmcoords;&lt;br /&gt;
 CREATE TABLE oldfirmcoords (&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
 &lt;br /&gt;
 \COPY oldfirmcoords FROM 'FirmCoords.txt' DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --5556&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmoldfilter;&lt;br /&gt;
 CREATE TABLE firmoldfilter AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM oldfirmcoords;&lt;br /&gt;
 --5556&lt;br /&gt;
&lt;br /&gt;
Since oldfirmcoords does not have state codes, we need to bring in state codes in order to add back firms based in Puerto Rico, Hawaii, and Alaska. I did this by matching the firmoldfilter table back to the firm base table. Firms with no match in firmoldfilter come out of the LEFT JOIN with a NULL excludeflag, so I used the COALESCE function to set their flag to 1. This excludes firms that we had not geocoded due to faulty addresses.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsmatch1;&lt;br /&gt;
 CREATE TABLE firmcoordsmatch1 AS SELECT &lt;br /&gt;
 A.firmname, A.state, B.latitude, B.longitude, COALESCE(B.excludeflag, 1) AS excludeflag FROM firmbasecore AS A LEFT JOIN firmoldfilter AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
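&lt;br /&gt;
To illustrate what COALESCE does here, the following sketch uses two made-up rows in place of the joined tables. COALESCE returns its first non-NULL argument, so a flag that is already set is kept, while a NULL flag, as produced by an unmatched LEFT JOIN row, becomes 1.&lt;br /&gt;
&lt;br /&gt;
 --Hypothetical rows: one matched firm with a flag, one unmatched firm with NULL&lt;br /&gt;
 SELECT firmname, COALESCE(excludeflag, 1) AS excludeflag&lt;br /&gt;
 FROM (VALUES ('Matched Firm', 0), ('Unmatched Firm', NULL)) AS T(firmname, excludeflag);&lt;br /&gt;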
&lt;br /&gt;
Then the process of tagging the PR, HI, and AK firms and keeping only the correctly tagged ones is the same as for companies.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsexternal;&lt;br /&gt;
 CREATE TABLE firmcoordsexternal AS&lt;br /&gt;
 SELECT *, CASE WHEN state='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN state='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN state='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM firmcoordsmatch1;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodfirmgeoold;&lt;br /&gt;
 CREATE TABLE goodfirmgeoold AS SELECT&lt;br /&gt;
 A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM firmcoreonedupremoved AS A LEFT JOIN firmcoordsexternal AS B ON A.firmname=B.firmname&lt;br /&gt;
 WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --5346&lt;br /&gt;
&lt;br /&gt;
Find the remaining firms and run the geocode script on them.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE remainingfirm;&lt;br /&gt;
 CREATE TABLE remainingfirm AS SELECT A.firmname, A.addr1, A.addr2, A.city, A.state, A.zip FROM firmcoreonedupremoved AS A LEFT JOIN goodfirmgeoold AS B ON A.firmname=B.firmname&lt;br /&gt;
 WHERE B.firmname IS NULL AND A.addr1 IS NOT NULL AND A.msacode!='9999';&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
 \COPY remainingfirm TO 'FirmGeoRemaining.txt' DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmremainingcoords;&lt;br /&gt;
 CREATE TABLE firmremainingcoords(&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY firmremainingcoords FROM 'FirmRemainingCoords.txt' DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
Follow the same filtering process as above to get the good geocodes. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmnewfilter;&lt;br /&gt;
 CREATE TABLE firmnewfilter AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM firmremainingcoords;&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsmatch2;&lt;br /&gt;
 CREATE TABLE firmcoordsmatch2 AS SELECT &lt;br /&gt;
 A.firmname, A.state, B.latitude, B.longitude, COALESCE(B.excludeflag, 1) AS excludeflag FROM firmcoreonedupremoved AS A LEFT JOIN firmnewfilter AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsexternalremaining;&lt;br /&gt;
 CREATE TABLE firmcoordsexternalremaining AS&lt;br /&gt;
 SELECT *, CASE WHEN state='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN state='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN state='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM firmcoordsmatch2;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodfirmgeonew;&lt;br /&gt;
 CREATE TABLE goodfirmgeonew AS SELECT A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM firmcoreonedupremoved AS A LEFT JOIN firmcoordsexternalremaining AS B &lt;br /&gt;
 ON A.firmname=B.firmname&lt;br /&gt;
 WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --703&lt;br /&gt;
&lt;br /&gt;
Combine the old and new geocoded firms and match them to firm base to get a firm geo table.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmgeocoords;&lt;br /&gt;
 CREATE TABLE firmgeocoords AS&lt;br /&gt;
 SELECT * FROM goodfirmgeonew&lt;br /&gt;
 UNION&lt;br /&gt;
 SELECT * FROM goodfirmgeoold;&lt;br /&gt;
 --6049&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmgeocore;&lt;br /&gt;
 CREATE TABLE firmgeocore AS&lt;br /&gt;
 SELECT A.*, B.latitude, B.longitude FROM firmbasecore AS A LEFT JOIN firmgeocoords AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
===Branch Offices===&lt;br /&gt;
I did not use old branch office data because I could not find it anywhere in the old data set. I have since found old data in the table firmbasecoords in vcdb2. &lt;br /&gt;
&lt;br /&gt;
First copy all of the needed data out of the database to do geocoding.&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT A.firmname, A.boaddr1, A.boaddr2, A.bocity, A.bostate, A.bozip FROM bonound AS A WHERE A.boaddr1 IS NOT NULL) TO 'BranchOffices.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --2046&lt;br /&gt;
&lt;br /&gt;
Then load the data into the database and follow the same filtering process as above.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeo;&lt;br /&gt;
 CREATE TABLE bogeo (&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY bogeo FROM 'BranchOfficesGeo.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --2046&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeo1;&lt;br /&gt;
 CREATE TABLE bogeo1 AS SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM bogeo;&lt;br /&gt;
 --2046&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bomatchgeo;&lt;br /&gt;
 CREATE TABLE bomatchgeo AS&lt;br /&gt;
 SELECT A.*, B.latitude, B.longitude, COALESCE(B.excludeflag, 1) AS excludeflag FROM branchofficecore AS A LEFT JOIN bogeo1 AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeo2;&lt;br /&gt;
 CREATE TABLE bogeo2 AS&lt;br /&gt;
 SELECT *, CASE WHEN bostate='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN bostate='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN bostate='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM bomatchgeo;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
Match the correctly geocoded branch offices back to firm base to get the final table.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeocore1;&lt;br /&gt;
 CREATE TABLE bogeocore1 AS&lt;br /&gt;
 SELECT * FROM bogeo2 WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --1161&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmbogeo;&lt;br /&gt;
 CREATE TABLE firmbogeo AS&lt;br /&gt;
 SELECT A.*, B.latitude AS BOLatitude, B.longitude AS BOLongitude FROM firmgeocore AS A LEFT JOIN bogeocore1 AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
==Creating People Tables==&lt;br /&gt;
We pulled data on executives in both portcos and funds. I describe the process below. If any of the explanations don't make sense, I also describe most tables in the section called Marcos's Code.&lt;br /&gt;
===Company People===&lt;br /&gt;
 DROP TABLE titlelookup;&lt;br /&gt;
 CREATE TABLE titlelookup(&lt;br /&gt;
 	fulltitle varchar(150),&lt;br /&gt;
 	charman int, &lt;br /&gt;
 	ceo int,&lt;br /&gt;
 	cfo int,&lt;br /&gt;
 	coo int,&lt;br /&gt;
 	cio int,&lt;br /&gt;
 	cto int,&lt;br /&gt;
 	otherclvl int,&lt;br /&gt;
 	boardmember int,&lt;br /&gt;
 	president int,&lt;br /&gt;
 	vp int,&lt;br /&gt;
 	founder int,&lt;br /&gt;
 	director int&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY titlelookup FROM 'Important Titles in Women2017 dataset.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --628&lt;br /&gt;
&lt;br /&gt;
This table lists the various job titles a person can have and flags which traditional executive roles each title falls under.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE copeople;&lt;br /&gt;
 CREATE TABLE copeople(&lt;br /&gt;
 	datefirstinv   date,&lt;br /&gt;
 	cname varchar(150),&lt;br /&gt;
 	statecode  varchar(2),&lt;br /&gt;
 	prefix varchar(5),&lt;br /&gt;
 	firstname varchar(50),&lt;br /&gt;
 	lastname varchar(50),&lt;br /&gt;
 	jobtitle varchar(150),&lt;br /&gt;
 	nonmanaging  varchar(1),&lt;br /&gt;
 	prevpos  varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY copeople FROM 'Executives-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --194359&lt;br /&gt;
&lt;br /&gt;
This table holds the executives of portcos, loaded from SDC. Next we have to identify which traditional executive-level job each listed job title corresponds to. The next table also uses the prefix to identify an executive as male or female. Note that I mistakenly named the company name column cname instead of coname when loading the data. If you want to save yourself work, name it coname.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE copeoplebase;&lt;br /&gt;
 CREATE TABLE copeoplebase AS&lt;br /&gt;
 SELECT copeople.*,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 1::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 0::int&lt;br /&gt;
 	ELSE Null::int END AS titlefemale,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 0::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 1::int&lt;br /&gt;
 	ELSE Null::int END AS titlemale,&lt;br /&gt;
 CASE WHEN prefix='Dr' THEN 1::int&lt;br /&gt;
 	ELSE 0::int END AS doctor,&lt;br /&gt;
 CASE WHEN prefix IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hastitle,&lt;br /&gt;
 CASE WHEN prefix IS NULL AND firstname IS NULL AND lastname IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hasperson,&lt;br /&gt;
 CASE WHEN fulltitle IS NOT NULL THEN 1::int ELSE 0::int END AS hastitlelookup,&lt;br /&gt;
 CASE WHEN charman IS NOT NULL THEN charman ELSE 0::int END AS chairman,&lt;br /&gt;
 CASE WHEN ceo IS NOT NULL THEN ceo ELSE 0::int END AS ceo,&lt;br /&gt;
 CASE WHEN cfo IS NOT NULL THEN cfo ELSE 0::int END AS cfo,&lt;br /&gt;
 CASE WHEN coo IS NOT NULL THEN coo ELSE 0::int END AS coo,&lt;br /&gt;
 CASE WHEN cio IS NOT NULL THEN cio ELSE 0::int END AS cio,&lt;br /&gt;
 CASE WHEN cto IS NOT NULL THEN cto ELSE 0::int END AS cto,&lt;br /&gt;
 CASE WHEN otherclvl IS NOT NULL THEN otherclvl ELSE 0::int END AS otherclvl,&lt;br /&gt;
 CASE WHEN boardmember IS NOT NULL THEN boardmember ELSE 0::int END AS boardmember,&lt;br /&gt;
 CASE WHEN president IS NOT NULL THEN president ELSE 0::int END AS president,&lt;br /&gt;
 CASE WHEN vp IS NOT NULL THEN vp ELSE 0::int END AS vp,&lt;br /&gt;
 CASE WHEN founder IS NOT NULL THEN founder ELSE 0::int END AS founder,&lt;br /&gt;
 CASE WHEN director IS NOT NULL THEN director ELSE 0::int END AS director&lt;br /&gt;
 FROM copeople&lt;br /&gt;
 LEFT JOIN titlelookup ON copeople.jobtitle=titlelookup.fulltitle;&lt;br /&gt;
 --194359&lt;br /&gt;
&lt;br /&gt;
Next we will try to identify whether an executive is male or female based on their names.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE namegender;&lt;br /&gt;
 CREATE TABLE namegender AS&lt;br /&gt;
 SELECT firstname, &lt;br /&gt;
 CASE WHEN countfemale &amp;gt; 0 AND countmale=0 THEN 1::int ELSE 0::int END AS exclusivelyfemale,&lt;br /&gt;
 CASE WHEN countmale &amp;gt; 0 AND countfemale=0 THEN 1::int ELSE 0::int END AS exclusivelymale&lt;br /&gt;
 FROM&lt;br /&gt;
 	(SELECT firstname, COALESCE(sum(titlefemale),0) as countfemale,  COALESCE(sum(titlemale),0) as countmale &lt;br /&gt;
 	FROM copeoplebase WHERE doctor=0&lt;br /&gt;
 	GROUP BY firstname) As T&lt;br /&gt;
 WHERE NOT (countfemale &amp;gt; 0 AND countmale&amp;gt;0);&lt;br /&gt;
 --12736&lt;br /&gt;
&lt;br /&gt;
The next table expands CoPeopleBase to include information on executive gender and executive position.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE CoPeopleFull;&lt;br /&gt;
 CREATE TABLE CoPeopleFull AS&lt;br /&gt;
 SELECT copeoplebase.*,&lt;br /&gt;
 CASE WHEN titlefemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelyfemale=1 THEN 1::int ELSE 0::int END AS female,&lt;br /&gt;
 CASE WHEN titlemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelymale=1 THEN 1::int ELSE 0::int END AS male,	&lt;br /&gt;
 CASE WHEN (titlefemale=1 OR titlemale=1 OR exclusivelymale=1 OR exclusivelyfemale=1) THEN 0::int ELSE 1::int END AS unknowngender,&lt;br /&gt;
 CASE WHEN (ceo=1 OR president=1) THEN 1::int ELSE 0::int END AS ceopres,&lt;br /&gt;
 CASE WHEN (chairman=1 OR ceo=1 OR cfo=1 OR coo=1 OR cio=1 OR cto=1 OR otherclvl=1 OR president=1) THEN 1::int ELSE 0::int END AS CLevel,&lt;br /&gt;
 CASE WHEN (chairman=1 OR ceo=1 OR cfo=1 OR coo=1 OR cio=1 OR cto=1 OR otherclvl=1 OR president=1 OR director=1 OR boardmember=1) THEN 1::int ELSE 0::int END AS board,&lt;br /&gt;
 CASE WHEN (chairman=1 OR ceo=1 OR cfo=1 OR coo=1 OR cio=1 OR cto=1 OR otherclvl=1 OR president=1 OR director=1 OR boardmember=1 OR vp=1 OR founder=1) THEN 1::int ELSE &lt;br /&gt;
 0::int END AS vpandabove&lt;br /&gt;
 FROM copeoplebase&lt;br /&gt;
 LEFT JOIN namegender ON namegender.firstname=copeoplebase.firstname&lt;br /&gt;
 WHERE hasperson=1;&lt;br /&gt;
 --177547&lt;br /&gt;
&lt;br /&gt;
The next table only keeps executive listings that have a valid portco primary key associated with them. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE CoPeopleKey;&lt;br /&gt;
 CREATE TABLE CoPeopleKey AS&lt;br /&gt;
 SELECT A.*&lt;br /&gt;
 FROM CoPeopleFull AS A&lt;br /&gt;
 JOIN (SELECT firstname, lastname, cname, datefirstinv, statecode, count(*) FROM CoPeopleFull &lt;br /&gt;
 WHERE firstname IS NOT NULL AND lastname IS NOT NULL AND cname IS NOT NULL AND datefirstinv IS NOT NULL AND statecode IS NOT NULL&lt;br /&gt;
 GROUP BY firstname, lastname, cname, datefirstinv, statecode HAVING COUNT(*)=1) AS B&lt;br /&gt;
 ON A.firstname=B.firstname AND A.lastname=B.lastname AND A.datefirstinv=B.datefirstinv AND A.cname=B.cname AND A.statecode=B.statecode;&lt;br /&gt;
 --176251&lt;br /&gt;
&lt;br /&gt;
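The next table identifies whether a person previously held executive positions. It matches each person, by first and last name, to their listings at companies with an earlier date of first investment.&lt;br /&gt;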
&lt;br /&gt;
 CREATE TABLE CoPeopleSerial AS&lt;br /&gt;
 SELECT firstname, lastname, cname, datefirstinv, statecode, &lt;br /&gt;
 COALESCE(sum(hasperson),0) as prev,&lt;br /&gt;
 COALESCE(sum(ceo),0) as prevceo,&lt;br /&gt;
 COALESCE(sum(ceopres),0) as prevceopres,&lt;br /&gt;
 COALESCE(sum(founder),0) as prevfounder,&lt;br /&gt;
 COALESCE(sum(clevel),0) as prevclevel,&lt;br /&gt;
 COALESCE(sum(board),0) as prevboard,&lt;br /&gt;
 COALESCE(sum(vpandabove),0) as prevvpandabove,&lt;br /&gt;
 CASE WHEN sum(hasperson) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serial,&lt;br /&gt;
 CASE WHEN sum(ceo) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialceo,&lt;br /&gt;
 CASE WHEN sum(ceopres) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialceopres,&lt;br /&gt;
 CASE WHEN sum(founder) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialfounder,&lt;br /&gt;
 CASE WHEN sum(clevel) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialclevel,&lt;br /&gt;
 CASE WHEN sum(board) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialboard,&lt;br /&gt;
 CASE WHEN sum(vpandabove) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialvpandabove&lt;br /&gt;
 FROM (&lt;br /&gt;
 	SELECT A.prefix, A.firstname, A.lastname, A.cname, A.datefirstinv, A.statecode, &lt;br /&gt;
 	B.cname as prevcname, B.datefirstinv as prevdatefirstinv, B.statecode as prevstatecode, B.ceo, B.ceopres, B.founder, B.clevel, B.board, B.vpandabove, B.hasperson&lt;br /&gt;
 	FROM CoPeopleKey AS A&lt;br /&gt;
 	LEFT JOIN CoPeopleKey AS B ON A.firstname=B.firstname AND A.lastname=B.lastname AND A.datefirstinv &amp;gt; B.datefirstinv&lt;br /&gt;
 ) AS T&lt;br /&gt;
 GROUP BY firstname, lastname, cname, datefirstinv, statecode;&lt;br /&gt;
 --176251&lt;br /&gt;
&lt;br /&gt;
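The last table aggregates executive information for each company. There are too many columns to explain individually, but they follow one pattern: each is a sum, over a company's executives, of a product of the columns built above. For example, sum(ceopres*female) counts the female CEOs and presidents at a company.&lt;br /&gt;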
&lt;br /&gt;
 DROP TABLE copeopleagg;&lt;br /&gt;
 CREATE TABLE copeopleagg AS&lt;br /&gt;
 SELECT A.cname, A.datefirstinv, A.statecode, &lt;br /&gt;
 sum(hasperson) as numperson,&lt;br /&gt;
 sum(hastitle) as numtitled,&lt;br /&gt;
 CASE WHEN sum(ceopres) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasceopres,&lt;br /&gt;
 CASE WHEN sum(founder) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasfounder,&lt;br /&gt;
 CASE WHEN sum(clevel) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasclevel,&lt;br /&gt;
 CASE WHEN sum(board) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasboard,&lt;br /&gt;
 CASE WHEN sum(vpandabove) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasvpandabove,&lt;br /&gt;
 sum(female) as females,&lt;br /&gt;
 sum(male) as males,&lt;br /&gt;
 sum(unknowngender) as ugs,&lt;br /&gt;
 sum(doctor*female) as femaledoctors,&lt;br /&gt;
 sum(doctor*male) as maledoctors,&lt;br /&gt;
 sum(doctor*unknowngender) as ugdoctors,&lt;br /&gt;
 sum(ceopres*female) as femaleceos,&lt;br /&gt;
 sum(ceopres*male) as maleceos,&lt;br /&gt;
 sum(ceopres*unknowngender) as ugceos,&lt;br /&gt;
 sum(ceopres*female*doctor) as femaledoctorceos,&lt;br /&gt;
 sum(ceopres*male*doctor) as maledoctorceos,&lt;br /&gt;
 sum(ceopres*unknowngender*doctor) as ugdoctorceos,&lt;br /&gt;
 sum(founder*female) as femalefounders,&lt;br /&gt;
 sum(founder*male) as malefounders,&lt;br /&gt;
 sum(founder*unknowngender) as ugfounders,&lt;br /&gt;
 sum(founder*female*doctor) as femaledoctorfounders,&lt;br /&gt;
 sum(founder*male*doctor) as maledoctorfounders,&lt;br /&gt;
 sum(founder*unknowngender*doctor) as ugdoctorfounders,&lt;br /&gt;
 sum(clevel*female) as femaleclevels,&lt;br /&gt;
 sum(clevel*male) as maleclevels,&lt;br /&gt;
 sum(clevel*unknowngender) as ugclevels,&lt;br /&gt;
 sum(clevel*female*doctor) as femaledoctorclevels,&lt;br /&gt;
 sum(clevel*male*doctor) as maledoctorclevels,&lt;br /&gt;
 sum(clevel*unknowngender*doctor) as ugdoctorclevels,&lt;br /&gt;
 sum(board*female) as femaleboards,&lt;br /&gt;
 sum(board*male) as maleboards,&lt;br /&gt;
 sum(board*unknowngender) as ugboards,&lt;br /&gt;
 sum(board*female*doctor) as femaledoctorboards,&lt;br /&gt;
 sum(board*male*doctor) as maledoctorboards,&lt;br /&gt;
 sum(board*unknowngender*doctor) as ugdoctorboards,&lt;br /&gt;
 sum(vpandabove*female) as femaleabovevps,&lt;br /&gt;
 sum(vpandabove*male) as maleabovevps,&lt;br /&gt;
 sum(vpandabove*unknowngender) as ugabovevps,&lt;br /&gt;
 sum(vpandabove*female*doctor) as femaledoctorabovevps,&lt;br /&gt;
 sum(vpandabove*male*doctor) as maledoctorabovevps,&lt;br /&gt;
 sum(vpandabove*unknowngender*doctor) as ugdoctorabovevps,&lt;br /&gt;
 sum(prev*female) as femaleprevs,&lt;br /&gt;
 sum(prev*male) as maleprevs,&lt;br /&gt;
 sum(prev*unknowngender) as ugprevs,&lt;br /&gt;
 sum(prevceopres*female) as femaleprevceopres,&lt;br /&gt;
 sum(prevceopres*male) as maleprevceopres,&lt;br /&gt;
 sum(prevceopres*unknowngender) as ugprevceopres,&lt;br /&gt;
 sum(prevfounder*female) as femaleprevfounder,&lt;br /&gt;
 sum(prevfounder*male) as maleprevfounder,&lt;br /&gt;
 sum(prevfounder*unknowngender) as ugprevfounder,&lt;br /&gt;
 sum(prevclevel*female) as femaleprevclevel,&lt;br /&gt;
 sum(prevclevel*male) as maleprevclevel,&lt;br /&gt;
 sum(prevclevel*unknowngender) as ugprevclevel,&lt;br /&gt;
 sum(prevboard*female) as femaleprevboard,&lt;br /&gt;
 sum(prevboard*male) as maleprevboard,&lt;br /&gt;
 sum(prevboard*unknowngender) as ugprevboard,&lt;br /&gt;
 sum(prevvpandabove*female) as femaleprevvpandabove,&lt;br /&gt;
 sum(prevvpandabove*male) as maleprevvpandabove,&lt;br /&gt;
 sum(prevvpandabove*unknowngender) as ugprevvpandabove,&lt;br /&gt;
 sum(serial*female) as femaleserials,&lt;br /&gt;
 sum(serial*male) as maleserials,&lt;br /&gt;
 sum(serial*unknowngender) as ugserials,&lt;br /&gt;
 sum(serialceopres*female) as femaleserialceopres,&lt;br /&gt;
 sum(serialceopres*male) as maleserialceopres,&lt;br /&gt;
 sum(serialceopres*unknowngender) as ugserialceopres,&lt;br /&gt;
 sum(serialfounder*female) as femaleserialfounder,&lt;br /&gt;
 sum(serialfounder*male) as maleserialfounder,&lt;br /&gt;
 sum(serialfounder*unknowngender) as ugserialfounder,&lt;br /&gt;
 sum(serialclevel*female) as femaleserialclevel,&lt;br /&gt;
 sum(serialclevel*male) as maleserialclevel,&lt;br /&gt;
 sum(serialclevel*unknowngender) as ugserialclevel,&lt;br /&gt;
 sum(serialboard*female) as femaleserialboard,&lt;br /&gt;
 sum(serialboard*male) as maleserialboard,&lt;br /&gt;
 sum(serialboard*unknowngender) as ugserialboard,&lt;br /&gt;
 sum(serialvpandabove*female) as femaleserialvpandabove,&lt;br /&gt;
 sum(serialvpandabove*male) as maleserialvpandabove,&lt;br /&gt;
 sum(serialvpandabove*unknowngender) as ugserialvpandabove,&lt;br /&gt;
 sum(ceopres*serialceopres*female) as femaleceopresserialceopres,&lt;br /&gt;
 sum(ceopres*serialceopres*male) as maleceopresserialceopres,&lt;br /&gt;
 sum(ceopres*serialceopres*unknowngender) as ugceopresserialceopres,&lt;br /&gt;
 sum(founder*serialfounder*female) as femalefounderserialfounder,&lt;br /&gt;
 sum(founder*serialfounder*male) as malefounderserialfounder,&lt;br /&gt;
 sum(founder*serialfounder*unknowngender) as ugfounderserialfounder &lt;br /&gt;
 FROM CoPeoplekey AS A&lt;br /&gt;
 JOIN CoPeopleSerial AS B &lt;br /&gt;
 ON A.firstname=B.firstname AND A.lastname=B.lastname AND A.datefirstinv=B.datefirstinv AND A.cname=B.cname AND A.statecode=B.statecode&lt;br /&gt;
 GROUP BY A.cname, A.datefirstinv, A.statecode;&lt;br /&gt;
 --30413&lt;br /&gt;
&lt;br /&gt;
Since this table is so big, it is a good idea to have a smaller, more manageable table to work with. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE copeopleaggsimple;&lt;br /&gt;
 CREATE TABLE copeopleaggsimple AS&lt;br /&gt;
 SELECT cname, datefirstinv, statecode, numperson, females, males, ugs, ugdoctors, maleserials+femaleserials+ugserials AS serials&lt;br /&gt;
 FROM copeopleagg;&lt;br /&gt;
 --30413&lt;br /&gt;
&lt;br /&gt;
===Fund People===&lt;br /&gt;
Luckily, this process is much easier than the company people process. First we must simply load the data into the db.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundpeople;&lt;br /&gt;
 CREATE TABLE fundpeople(&lt;br /&gt;
 	fundname  varchar(255),&lt;br /&gt;
 	fundyear  int,&lt;br /&gt;
 	prefix varchar(5),&lt;br /&gt;
 	firstname varchar(50),&lt;br /&gt;
 	lastname varchar(50),&lt;br /&gt;
 	jobtitle varchar(150),&lt;br /&gt;
 	 prevpos  varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY fundpeople FROM 'Executives-Funds-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --328994&lt;br /&gt;
&lt;br /&gt;
The next table identifies degree and sex information about the executives of the fund.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundpeoplebase;&lt;br /&gt;
 CREATE TABLE fundpeoplebase AS&lt;br /&gt;
 SELECT fundpeople.*,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 1::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 0::int&lt;br /&gt;
 	ELSE Null::int END AS titlefemale,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 0::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 1::int&lt;br /&gt;
 	ELSE Null::int END AS titlemale,&lt;br /&gt;
 CASE WHEN prefix='Dr' THEN 1::int&lt;br /&gt;
 	ELSE 0::int END AS doctor,&lt;br /&gt;
 CASE WHEN prefix IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hastitle,&lt;br /&gt;
 CASE WHEN prefix IS NULL AND firstname IS NULL AND lastname IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hasperson&lt;br /&gt;
 FROM fundpeople;&lt;br /&gt;
 --328994&lt;br /&gt;
&lt;br /&gt;
The next table tries to identify the sex of each executive using the namegender table defined above. It keeps only rows where a person is actually listed (hasperson=1).&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE FundPeopleFull;&lt;br /&gt;
 CREATE TABLE FundPeopleFull AS&lt;br /&gt;
 SELECT fundpeoplebase.*, exclusivelyfemale, exclusivelymale,&lt;br /&gt;
 CASE WHEN titlefemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelyfemale=1 AND exclusivelymale=0 AND (titlemale=0 OR titlemale IS NULL) THEN 1::int ELSE 0::int END AS female,&lt;br /&gt;
 CASE WHEN titlemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelymale=1  AND exclusivelyfemale=0 AND (titlefemale =0 OR titlefemale IS NULL) THEN 1::int ELSE 0::int END AS male,	&lt;br /&gt;
 CASE WHEN (titlefemale=1 OR titlemale=1 OR exclusivelymale=1 OR exclusivelyfemale=1) THEN 0::int ELSE 1::int END AS unknowngender&lt;br /&gt;
 FROM fundpeoplebase&lt;br /&gt;
 LEFT JOIN namegender ON namegender.firstname=fundpeoplebase.firstname&lt;br /&gt;
 WHERE hasperson=1;&lt;br /&gt;
 --320915&lt;br /&gt;
&lt;br /&gt;
The next table gives you information on executives aggregated by fund.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE FundPeopleAgg;&lt;br /&gt;
 CREATE TABLE FundPeopleAgg AS&lt;br /&gt;
 SELECT fundname, &lt;br /&gt;
 sum(female) as numfemale,&lt;br /&gt;
 sum(male) as nummale,&lt;br /&gt;
 sum(unknowngender) as numunknowngender,&lt;br /&gt;
 sum(doctor) as numdoctor,&lt;br /&gt;
 sum(female*doctor) as numfemaledoctor,&lt;br /&gt;
 sum(male*doctor) as nummaledoctor,&lt;br /&gt;
 sum(unknowngender*doctor) as numunknowngenderdoctor,&lt;br /&gt;
 sum(hastitle) as numtitled,&lt;br /&gt;
 sum(hasperson) as numpeople, &lt;br /&gt;
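 --cast to real before dividing: sum() of an int column is bigint, and bigint/bigint truncates&lt;br /&gt;
 CASE WHEN sum(hasperson) &amp;gt; 0 THEN sum(female)::real/sum(hasperson) ELSE NULL END as fracfemale,&lt;br /&gt;
 CASE WHEN sum(male) &amp;gt; 0 THEN sum(female)::real/sum(male) ELSE NULL END as ratiofemale&lt;br /&gt;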
 FROM FundPeopleFull&lt;br /&gt;
 GROUP BY fundname;&lt;br /&gt;
 --21536&lt;br /&gt;
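&lt;br /&gt;
As a sanity check (a sketch, not part of the original load script), the following query lists any funds whose three gender counts do not add up to numpeople. An empty result suggests that every listed person fell into exactly one of the female, male, or unknown categories.&lt;br /&gt;
&lt;br /&gt;
 --Sketch only: uses the FundPeopleAgg columns defined above&lt;br /&gt;
 SELECT fundname, numfemale, nummale, numunknowngender, numpeople&lt;br /&gt;
 FROM FundPeopleAgg&lt;br /&gt;
 WHERE numfemale+nummale+numunknowngender != numpeople;&lt;br /&gt;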
&lt;br /&gt;
It is also good to have this information on firms. We do not pull firm people information from SDC. However, we have enough information to create it from preexisting tables.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmpeopleagg;&lt;br /&gt;
 CREATE TABLE firmpeopleagg AS &lt;br /&gt;
 SELECT _firmname as firmname, sum(numfemale) as firmwomen, sum(nummale) as firmmen, sum(numunknowngender) as firmugs, &lt;br /&gt;
 sum(numdoctor) as firmdoctors, sum(numpeople) as firmpeople,&lt;br /&gt;
 CASE WHEN sum(numpeople) &amp;gt; 0 THEN (sum(numfemale)/sum(numpeople))::real ELSE NULL END as firmfracwomen,&lt;br /&gt;
 CASE WHEN sum(nummale) &amp;gt; 0 THEN (sum(numfemale)/sum(nummale))::real ELSE NULL END as firmratiowomen&lt;br /&gt;
 FROM roundlineaggfunds AS A&lt;br /&gt;
 JOIN fundpeopleagg AS B ON A._fundname=B.fundname&lt;br /&gt;
 GROUP BY _firmname;&lt;br /&gt;
 --5273&lt;br /&gt;
&lt;br /&gt;
==Marcos's Code==&lt;br /&gt;
This is code that a Rice student, Marcos Lee, wrote. I cleaned it and ran it. I have described the tables that I built and where they come from below. My code is located in:&lt;br /&gt;
 E:McNair\Projects\VentureXpert Database\vcdb3\LoadingScripts\MatchingEntrepsV3&lt;br /&gt;
&lt;br /&gt;
If you have issues understanding my explanation, go to this location and read the queries. Most of them are straightforward. &lt;br /&gt;
===Describing Stacks Created in Code===&lt;br /&gt;
 CoPeopleBase:&lt;br /&gt;
 -Builds from copeople and titlelookup&lt;br /&gt;
 -Identifies what roles people played in their companies&lt;br /&gt;
&lt;br /&gt;
 namegender:&lt;br /&gt;
 -built from copeoplebase&lt;br /&gt;
 -identifies male/female/unknown&lt;br /&gt;
&lt;br /&gt;
 CoPeopleFull:&lt;br /&gt;
 -built from copeoplebase and namegender&lt;br /&gt;
 -builds more extensive information on executives, including specifically what level of executive they are&lt;br /&gt;
&lt;br /&gt;
 CoPeopleKey:&lt;br /&gt;
 -built from CoPeopleFull&lt;br /&gt;
 -creates table where only executives with full primary keys are kept&lt;br /&gt;
&lt;br /&gt;
 CoPeopleSerial:&lt;br /&gt;
 -built from copeoplekey&lt;br /&gt;
 -keeps track of executives' previous jobs at executive level&lt;br /&gt;
&lt;br /&gt;
 CoPeopleAgg:&lt;br /&gt;
 -built from copeoplekey and copeopleserial&lt;br /&gt;
 -gets extensive information on executives for each company&lt;br /&gt;
&lt;br /&gt;
 FundPeopleBase:&lt;br /&gt;
 -built from fundpeople&lt;br /&gt;
 -identifies male/female/doctor&lt;br /&gt;
 -hasperson column is slightly odd: a row with only a lastname (no prefix or firstname) still gets a 1. It seems too broad to be of much use&lt;br /&gt;
&lt;br /&gt;
 FundPeopleFull:&lt;br /&gt;
 -built from fundpeoplebase, namegender&lt;br /&gt;
 -adds in male/female &lt;br /&gt;
&lt;br /&gt;
 Fundpeopleagg:&lt;br /&gt;
 -built from fundpeoplefull&lt;br /&gt;
 -has aggregations of gender info for each fund&lt;br /&gt;
&lt;br /&gt;
 RoundLineJoinerLeanffWlistno:&lt;br /&gt;
 -built from roundlinejoinerleanff&lt;br /&gt;
 -adds listno to funds&lt;br /&gt;
&lt;br /&gt;
 RoundLineAggFunds:&lt;br /&gt;
 -built from roundlinejoinerleanffwlistno and roundlineaggfirms&lt;br /&gt;
 -if there are two funds from one firm that invest in same portco, we choose only one and leave the others behind&lt;br /&gt;
&lt;br /&gt;
 RoundLineAggWExit:&lt;br /&gt;
 -built from roundlineaggfirms, portcoexitupdated, roundlineaggfunds&lt;br /&gt;
 -adds in exit information for each company in roundlineaggfirms&lt;br /&gt;
&lt;br /&gt;
 FirmPerf:&lt;br /&gt;
 -built from roundlineaggwexit&lt;br /&gt;
 -adds in various performance measures for a given firm &lt;br /&gt;
&lt;br /&gt;
 PortCoFundDemo:&lt;br /&gt;
 -built from roundlinejoinerleanffclean and fundpeopleagg&lt;br /&gt;
 -gives information on executives of funds who invested in the portcos&lt;br /&gt;
&lt;br /&gt;
 PortCoPeopleMaster:&lt;br /&gt;
 -built from PortCoMaster, PortCoIndustry, PortCoPatent, PortCoSBIR, copeopleagg, PortCoFundDemo, CPI, statelookupint&lt;br /&gt;
 -huge amount of data about companies and their executives&lt;br /&gt;
&lt;br /&gt;
 RoundAggDistBase:&lt;br /&gt;
 -built from portcogeo, firmbogeo, roundlineaggwexit&lt;br /&gt;
 -creates geographic points using long, lat from geocoding&lt;br /&gt;
&lt;br /&gt;
 RoundAggDist:&lt;br /&gt;
 -Built from roundaggdistbase&lt;br /&gt;
 -gets actual distances between portcos and firms. If a branch office exists and its distance is less than the distance to the firm, it chooses the branch office. Also generates a random number&lt;br /&gt;
&lt;br /&gt;
 FirmPeopleAgg:&lt;br /&gt;
 -built from roundlineaggfunds, fundpeopleagg&lt;br /&gt;
 -finds information on executives from different firms&lt;br /&gt;
&lt;br /&gt;
 PortCoMatchmaster:&lt;br /&gt;
 -built from portcopatent, portcoindustry, portcosbir, copeopleaggsimple, portcoid&lt;br /&gt;
 -gets all information together about portcos&lt;br /&gt;
&lt;br /&gt;
 FirmMatchMaster:&lt;br /&gt;
 -built from firmperf, firmvars, firmpeopleagg, firmid&lt;br /&gt;
 -gets all information together about firms&lt;br /&gt;
&lt;br /&gt;
 RoundLineMasterBase:&lt;br /&gt;
 -built from portcomatchmaster, firmmatchmaster, roundaggdist, roundlineaggwexit&lt;br /&gt;
 -builds large amount of information about portcos and firms, specifically info about exits and distances&lt;br /&gt;
&lt;br /&gt;
 MatchMostNumerous:&lt;br /&gt;
 -built from roundlinemasterbase&lt;br /&gt;
 -finds max number of portcos invested in by a firm that also invested in the company grouping by&lt;br /&gt;
&lt;br /&gt;
 MatchHighestRandom:&lt;br /&gt;
 -built from matchmostnumerous&lt;br /&gt;
 -if two firms that invested in one company had the same number of max port cos this randomly chooses one company&lt;br /&gt;
&lt;br /&gt;
 FirmActiveYearsCode20:&lt;br /&gt;
 -built from roundlinejoinerleanffclean, portcoindustry&lt;br /&gt;
 -adds firmname to industry code. Not exactly sure why DISTINCT is used in the query&lt;br /&gt;
&lt;br /&gt;
 RealMatchesCode20:&lt;br /&gt;
 -built from MatchHighestRandom, PortCoIndustry&lt;br /&gt;
 -real matches between portcos and firms that invested in them including the code20&lt;br /&gt;
&lt;br /&gt;
 SyntheticFirmSetBaseCode20:&lt;br /&gt;
 -built from realmatchescode20, firmactiveyearscode20&lt;br /&gt;
 -crossproduct of firms and portcos. finds firms that invested in same year as portco received first inv, firms invested in same type of company, and makes sure matches are unique&lt;br /&gt;
&lt;br /&gt;
 AllMatchKeys:&lt;br /&gt;
 -built from SyntheticFirmSetBaseCode20, RealMatchesCode20&lt;br /&gt;
 -combines synthetic and real matches&lt;br /&gt;
&lt;br /&gt;
 SynthRoundAggDistBaseCode20:&lt;br /&gt;
 -built from allmatchkeys, portcogeo, firmbogeo&lt;br /&gt;
 -builds points for all portco, firm listings in allmatch keys&lt;br /&gt;
&lt;br /&gt;
 SynthRoundAggDistCode20:&lt;br /&gt;
 -built from synthroundaggdistbasecode20&lt;br /&gt;
 -finds actual distance between portcos and firms using installed extensions. Chooses the branch office if the distance between portco and branch office is less than the distance to the firm&lt;br /&gt;
&lt;br /&gt;
 SynthFirmnameInduBlowoutCode20:&lt;br /&gt;
 -built from allmatchkeys, roundlinemasterbase&lt;br /&gt;
 -gets every firm combination and checks whether the companies that those firms invested in are in the same general industry&lt;br /&gt;
&lt;br /&gt;
 SynthFirmNameroundInduHistCode20:&lt;br /&gt;
 -built from SynthFirmnameInduBlowoutcode20&lt;br /&gt;
 -gets information by portco, firmname match about what the firms past investment patterns are&lt;br /&gt;
&lt;br /&gt;
 MasterWithSynthBaseCode20Portco:&lt;br /&gt;
 -built from Allmatchkeys, matchhighestrandom, synthroundaggdistcode20, synthfirmnameroundinduhistcode20, synthfirmnameroundindutotalcode20, firmvars, copeopleaggsimple, portcomaster&lt;br /&gt;
 -builds a bunch of information about synthetic and real matches&lt;br /&gt;
&lt;br /&gt;
 SynthFirmnameRoundInduTotalCode20:&lt;br /&gt;
 -built from allmatchkeys, roundlinemasterbase&lt;br /&gt;
 -finds number of portcos in certain industries by firmnames&lt;br /&gt;
&lt;br /&gt;
 MasterWithSynthCode20Firms:&lt;br /&gt;
 -built with firmmatchmaster, allmatchkeys&lt;br /&gt;
 -matching a bunch of information to all firms&lt;br /&gt;
&lt;br /&gt;
 MasterWithSynthcode20:&lt;br /&gt;
 -built from masterwithsynthbasecode20portco, masterwithsynthcode20firms&lt;br /&gt;
 -gets a huge amount of info together on real and synthetic matches about firms and companies&lt;br /&gt;
&lt;br /&gt;
 MasterReals:&lt;br /&gt;
 -built from masterwithsynthcode20&lt;br /&gt;
 -gets just real matches from code&lt;br /&gt;
&lt;br /&gt;
 MasterOneSynth:&lt;br /&gt;
 -built from masterwithsynthcode20&lt;br /&gt;
 -gets just one randomly chosen synthetic match between companies and firms&lt;br /&gt;
&lt;br /&gt;
 MasterRealOneSynth:&lt;br /&gt;
 -built from masteronesynth, masterreals&lt;br /&gt;
 -combines the real and one synth table&lt;br /&gt;
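&lt;br /&gt;
The random tie-break described for MatchHighestRandom can be written in PostgreSQL with DISTINCT ON and ORDER BY random(). The query below is only a sketch of that idiom, not the query from MatchingEntrepsV3: it assumes that matchmostnumerous holds one row per tied firm for each company, and the column names portcoid and firmname are assumptions.&lt;br /&gt;
&lt;br /&gt;
 --Sketch only: portcoid and firmname are assumed column names&lt;br /&gt;
 --DISTINCT ON keeps the first row per portcoid; random() shuffles the rows within each portcoid&lt;br /&gt;
 SELECT DISTINCT ON (portcoid) portcoid, firmname&lt;br /&gt;
 FROM matchmostnumerous&lt;br /&gt;
 ORDER BY portcoid, random();&lt;br /&gt;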
&lt;br /&gt;
==Ranking Tables and Graphs==&lt;br /&gt;
This is a slight detour from the creation of VCDB3. However, this is a cool process because you actually get to use the data you've been working with. This process is extensive, but the queries are easy to understand. If you wish to have deeper understanding of the process, read the code. It is located in:&lt;br /&gt;
&lt;br /&gt;
 E:McNair\Projects\VentureXpert Database\vcdb3\LoadingScripts\RoundRanking.SQL&lt;br /&gt;
&lt;br /&gt;
First you must create a table that has aggregate round information grouped by cities and round year. Since this is a little difficult to picture, I will attach the code.&lt;br /&gt;
 DROP TABLE roundleveloutput;&lt;br /&gt;
 CREATE TABLE roundleveloutput AS SELECT&lt;br /&gt;
 city, statecode, roundyear AS year,&lt;br /&gt;
 SUM(rndamtestm*seedflag) AS seedamnt,&lt;br /&gt;
 SUM(rndamtestm*earlyflag) AS earlyamnt,&lt;br /&gt;
 SUM(rndamtestm*laterflag) AS lateramnt,&lt;br /&gt;
 SUM(rndamtestm*growthflag) AS selamnt,&lt;br /&gt;
 SUM(growthflag*dealflag) AS numseldeals&lt;br /&gt;
 FROM round GROUP BY city, statecode, roundyear;&lt;br /&gt;
 --30028&lt;br /&gt;
&lt;br /&gt;
Next, build the remaining tables in this order:&lt;br /&gt;
#Create a table that lists the all-time SEL amount by city. Keep including the state code, since city names are often repeated in different states and this ensures that you have the right city.&lt;br /&gt;
#Create a table that lists each unique city, state pair for every year since 1980.&lt;br /&gt;
#Build a table that matches portcos to the city, state, year blowout table for each year they were alive. This table should be relatively large, since it lists each company once for every year it was alive up until the present.&lt;br /&gt;
#Create a table that displays the number of companies alive in a city every year since 1980.&lt;br /&gt;
#Create a table that joins all of the information built in the previous tables by city, state, and year, and add in population.&lt;br /&gt;
#Run the ranking queries.&lt;br /&gt;
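&lt;br /&gt;
The city, state, year blowout can be sketched with generate_series. This is an illustration only and is not taken from RoundRanking.SQL: the table name citystateyearblowout and the end year 2020 are assumptions, while city and statecode are the columns of the round table used above.&lt;br /&gt;
&lt;br /&gt;
 --Sketch only: citystateyearblowout and the end year 2020 are assumptions&lt;br /&gt;
 CREATE TABLE citystateyearblowout AS&lt;br /&gt;
 SELECT A.city, A.statecode, B.year&lt;br /&gt;
 FROM (SELECT DISTINCT city, statecode FROM round) AS A&lt;br /&gt;
 CROSS JOIN generate_series(1980, 2020) AS B(year);&lt;br /&gt;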
&lt;br /&gt;
For states follow the same general process but group by states not cities and states. &lt;br /&gt;
&lt;br /&gt;
If this explanation was not enough for you (it was not meant to be in depth) go to the location defined above and read the actual code. With the description I have given, you should be able to piece together what each query does.&lt;br /&gt;
&lt;br /&gt;
==Master Tables==&lt;br /&gt;
Throughout the creation of the database, there are inevitably some tables that are vital to create a solid foundation. The following tables are the master tables with a quick explanation:&lt;br /&gt;
* '''Companybasecore'''- The base table for portcos. This is data that was drawn directly from SDC and was not changed other than for cleaning purposes. Count: 48001&lt;br /&gt;
* '''BranchOfficeCore'''- The base table for branch offices. This is data drawn directly from SDC. Here only branch offices with distinct firm names are included. Count: 10032&lt;br /&gt;
* '''FirmBaseCore'''- The base table for firms. This is also data taken directly from SDC and was not changed other than for cleaning purposes. Count: 15437&lt;br /&gt;
* '''FundBaseCore'''- The base table for funds. This is also data taken directly from SDC and was not changed other than for cleaning purposes. Count: 28833&lt;br /&gt;
* '''IPOCleanNoDups''' - This is the clean table of IPOs after being run through the matcher against portcos. It was cleaned manually and had duplicates removed. Count: 2136&lt;br /&gt;
* '''IPONoDups'''- This is the table before the cleaning process of matching to portcos. There could be problems with this table as we used an aggregate function here. Be careful using this table. Count: 11149&lt;br /&gt;
* '''MACleanNoDups'''- This is the clean table of MAs after being run through the matcher against portcos. It was cleaned manually and had duplicates removed. Count: 7171&lt;br /&gt;
* '''MANoDups'''- This is the table before the cleaning process of matching to portcos. There could be problems with this table as we used an aggregate function here as well. Be careful using this table. Count: 119374&lt;br /&gt;
* '''Round'''- This is the master round table. It has SEL flags attached to it and has the most round info. RoundBaseClean is also a decent table but has less information. This table is your best bet for round information. Count: 151323&lt;br /&gt;
* '''RoundLineJoinerLeanFFClean'''- This is the master round table for joining purposes. It was cleaned and used for widespread joining purposes. Count: 163157&lt;br /&gt;
* '''CoPeople'''- This is the base table for PortCo people information. It was pulled directly from SDC. Count: 194359&lt;br /&gt;
* '''FirmBoGeo'''- This is the base table for firm/branch office geocoding. This table was cleaned and contains lat/long readings for firms and branch offices where the information was available. Count: 15437&lt;br /&gt;
* '''PortCoGeo'''- This is the base table for portco geocoding. Table was cleaned and contains lat/long reading for portcos where the Google API returned a valid reading. Count: 48001&lt;br /&gt;
* '''FirmPerf'''- This is a wide reaching table about the performance of firms. It was mainly used later in the project but is extremely useful. Count: 8336&lt;br /&gt;
* '''FundPeople'''- This is the base table for fund people information. It was pulled directly from SDC. Count: 328994&lt;br /&gt;
* '''PortCoExitUpdated'''- This is the master exit table for portcos. The difference between this and PortCoExit is that Updated has two columns marking MAs and IPOs while the other has one column MAvsIPO. Use whichever one is more convenient. Count: 48001&lt;br /&gt;
* '''PortCoMaster'''- This table is great. There's a ton of information on PortCos including SEL flags, round amounts, and industry classifications. Count: 48001&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=Measuring_High-Growth_High-Technology_Entrepreneurship_Ecosystems&amp;diff=48669</id>
		<title>Measuring High-Growth High-Technology Entrepreneurship Ecosystems</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=Measuring_High-Growth_High-Technology_Entrepreneurship_Ecosystems&amp;diff=48669"/>
		<updated>2024-11-30T19:47:51Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{AcademicPaper&lt;br /&gt;
|Has title=Measuring High-Growth High-Technology Entrepreneurship Ecosystems&lt;br /&gt;
|Has author=Ed Egan,&lt;br /&gt;
|Has paper status=Published&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==Final Version==&lt;br /&gt;
&lt;br /&gt;
*The final version was accepted to Research Policy on May 17th, 2021. &lt;br /&gt;
*The 50-day share link is: https://authors.elsevier.com/a/1d8SaB5ASINVf &lt;br /&gt;
*The title was changed to &amp;quot;A Framework for Assessing Municipal High-Growth High-Tech Entrepreneurship Policy&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pdf&amp;gt;File:Egan_(2021)_-_A_Framework_for_Assessing_Municipal_High-Growth_High-Tech_Entrepreneurship_Policy.pdf&amp;lt;/pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BibTeX reference is (pending update with volume and number):&lt;br /&gt;
&lt;br /&gt;
 @article{EGAN2021104292,&lt;br /&gt;
   title = {A framework for assessing municipal high-growth high-technology entrepreneurship policy},&lt;br /&gt;
   journal = {Research Policy},&lt;br /&gt;
   pages = {104292},&lt;br /&gt;
   year = {2021},&lt;br /&gt;
   issn = {0048-7333},&lt;br /&gt;
   doi = {https://doi.org/10.1016/j.respol.2021.104292},&lt;br /&gt;
   url = {https://www.sciencedirect.com/science/article/pii/S0048733321000937},&lt;br /&gt;
   author = {Edward J. Egan},&lt;br /&gt;
   keywords = {Entrepreneurship, Ecosystem, Measurement, High-growth high-technology, Venture capital, Ecosystem support organization, Pipeline, Raise rate, Policy cartel},&lt;br /&gt;
   abstract = {This paper advances a framework for making rudimentary need, impact, and cost–benefit assessments of municipal high-growth high-tech entrepreneurship policy. The framework views ecosystem support organizations like accelerators, incubators, and hubs as components in a city’s venture pipeline. A component’s pipeline size, raise rate, and cost per raise measure its performance. In total, the framework consists of eight objective and reproducible measures based on quantities and qualities of venture capital investment and 16 definitions of related terms-of-the-art. These measures and definitions are illustrated in 26 real-world policy examples, which assess initiatives in Houston and St. Louis over the last 20 years. The examples reveal an enormous variation in welfare effects, and some policies appear welfare destroying. Many non-profit organizations claim success (and win awards and acclaim) using non-standard measures despite performing at less than half benchmark levels. Policy cartels, which control startup policy in many U.S. cities, also engage in non-market actions to protect their rents.}&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The final file series was v4-6-2 in:&lt;br /&gt;
 E:\projects\MeasuringHGHTEcosystems&lt;br /&gt;
 /bulk/vcdb4&lt;br /&gt;
 Egan (2021) - A Framework for Assessing Municipal High-Growth High-Tech Entrepreneurship Policy.pdf&lt;br /&gt;
&lt;br /&gt;
Production files (sent to ResPol):&lt;br /&gt;
*MeasuringHGHTEntrepreneurshipEcosystemsV4-6-2.tex&lt;br /&gt;
*MeasuringHGHTEntrepreneurshipEcosystemsV4-6-2-TitlePage.tex&lt;br /&gt;
*References.bib&lt;br /&gt;
*HoustonPipelineV4.png&lt;br /&gt;
*HoustonVCRaiseRateWithBenchmarkV4.png&lt;br /&gt;
*econ.bst&lt;br /&gt;
&lt;br /&gt;
==Notice==&lt;br /&gt;
&lt;br /&gt;
The original Measuring HGHT Entrepreneurship Ecosystems paper was broken into two:&lt;br /&gt;
*:'''A Framework for Assessing Municipal High-Growth High-Technology Entrepreneurship Policy''' now contains the definitions, measures, and examples. It is an inductive, case-study paper.&lt;br /&gt;
*[[Determinants of Future Investment in U.S. Startup Cities]]: The empirical analysis of ESOs is now in this paper!&lt;br /&gt;
&lt;br /&gt;
==Research Policy Special Issue==&lt;br /&gt;
&lt;br /&gt;
This paper is for a Special Issue of Research Policy, organized by/for the UMM grant cohort. The deadline for submission is Nov 30th, 2019. See: https://www.journals.elsevier.com/research-policy/call-for-papers/uncommon-methods-and-metrics&lt;br /&gt;
&lt;br /&gt;
Examples of questions that papers could address are:&lt;br /&gt;
#What fundamental constructs or elements might constitute a theory or theoretical base for the geographically defined entrepreneurial ecosystem?&lt;br /&gt;
#What are general definitions of entrepreneurial ecosystems so that entrepreneurial ecosystems can be measured in a consistent way across all sectors?&lt;br /&gt;
#What key relationships need to be captured at the entrepreneurial ecosystem level?&lt;br /&gt;
#How should the impact of local entrepreneurial ecosystems on economic growth at the national level be measured?&lt;br /&gt;
#Whose performance (and what) should be measured? Should researchers look at a mix of short- and long-term measures? &lt;br /&gt;
#Do existing rankings for entrepreneurial ecosystems measure what they claim to measure?&lt;br /&gt;
#To what extent are entrepreneurial ecosystems and innovation related?&lt;br /&gt;
#What are the salient levels of analysis (e.g., cultural, institutional, spatial) to consider when analyzing entrepreneurial behavior?&lt;br /&gt;
#How do the characteristics of entrepreneurial ecosystems vary by country?&lt;br /&gt;
#By which mechanisms do entrepreneurial ecosystems get established, mature, decline, or get renewed?&lt;br /&gt;
#What are the trade-offs between attracting entrepreneurs to a city, and solving urban problems such as affordable housing?&lt;br /&gt;
#Under what circumstances could a university be considered an ecosystem, and how does this interact with entrepreneurial ecosystems?&lt;br /&gt;
#What are more finely grained evaluations of the effectiveness of policy instruments that capture connections and ties across entrepreneurial ecosystems?&lt;br /&gt;
#To what extent is government policy accelerating or inhibiting the progress of entrepreneurial ecosystems?&lt;br /&gt;
&lt;br /&gt;
==Data and Analysis==&lt;br /&gt;
&lt;br /&gt;
The paper uses [[VCDB20]] and [[US Startup City Ranking]], as well as a wealth of old McNair material. Sources include (copied to the project folder unless otherwise noted):&lt;br /&gt;
*[[Hubs]]: Hubs Data v2_'16.xlsx&lt;br /&gt;
*[[Federal Grant Data]], including NIH, NSF and other grant data, especially SBIR/STTR. Possibly also contract data. &lt;br /&gt;
*[[Urban Start-up Agglomeration and Venture Capital Investment|Agglomeration]], including the locality indicators, and [[American Community Survey (ACS) Data]]&lt;br /&gt;
*Market vs. non-Market? E:\mcnair\Projects\Houston\MarketNonMarket&lt;br /&gt;
*Location of VCs (foreign vs. domestic, local vs. not, etc.) E:\mcnair\Projects\Houston\Houston Ecosystem Recommendations\2017ReportV1.xlsx&lt;br /&gt;
*Pipeline and raise rate for Houston: E:\mcnair\Projects\Houston\Acc Rank (IB) -- moved to subfolder pipeline&lt;br /&gt;
*[[U.S. Seed Accelerators]], and also other source material for [[Determinants of Seed Accelerator Performance: The Horse, the Jockey, and the Racetrack]]. Likely just the load of the file to rule them all...&lt;br /&gt;
*[[Incubator Seed Data]]&lt;br /&gt;
*Carnegie Classifications of Institutes of Higher Education (see [[University Patents]]). A new public data file was downloaded from http://carnegieclassifications.iu.edu/ and put in the folder (CCIHE2018-PublicData.xlsx).&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=Economic_definition_of_true_love&amp;diff=48668</id>
		<title>Economic definition of true love</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=Economic_definition_of_true_love&amp;diff=48668"/>
		<updated>2024-10-04T14:59:49Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* The Pitt-Depp Addendum */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Preamble==&lt;br /&gt;
&lt;br /&gt;
I originally tried to write an [[economic definition of true love]] for Valentine's Day in 2009 on a page entitled &amp;quot;Dating Ed&amp;quot;. It became one of the most popular pages on my website, receiving hundreds of thousands of views, and I maintained it across several different wikis. The version below no longer includes information about dating me, as I'm now married, but does bring back some other material that was deleted over the years.  &lt;br /&gt;
&lt;br /&gt;
==Definition of True Love==&lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;H&amp;lt;/math&amp;gt; denote the set of all entities (perhaps Humans, though we might also include dogs, cats and horses, according to historical precedent).&lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; denote the set of pairs of individuals who have True Love, such that:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\forall\{i,j\} \in T: \quad (i \succ_j h \quad \forall h \ne i) \wedge (j \succ_i h \quad \forall h \ne j), \quad h \in H \cup \{\emptyset\}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that:&lt;br /&gt;
*The definition employs strict preferences. A polyamorous definition might allow weak preferences instead.&lt;br /&gt;
*The union with the empty set allows for people who would rather be alone (e.g. Liz Lemon/Tina Fey), provided that we allow a mild abuse of notation so that &amp;lt;math&amp;gt;\{\emptyset\} \succ_{i} h&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==The Existence of True Love==&lt;br /&gt;
&lt;br /&gt;
Can we prove that &amp;lt;math&amp;gt; T \ne \{\emptyset\}&amp;lt;/math&amp;gt; ?&lt;br /&gt;
&lt;br /&gt;
===The Brad Pitt Problem===&lt;br /&gt;
&lt;br /&gt;
Rational preferences aren't sufficient to guarantee that &amp;lt;math&amp;gt; T \ne \{\emptyset\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
'''Proof:'''&lt;br /&gt;
&lt;br /&gt;
Recall that a preference relation is rational if it is complete and transitive:&lt;br /&gt;
#Completeness: &amp;lt;math&amp;gt;\forall x,y \in X: \quad x \succsim y \;\lor\; y \succsim x&amp;lt;/math&amp;gt;&lt;br /&gt;
#Transitivity: &amp;lt;math&amp;gt;\forall x,y,z \in X: \quad \mbox{if}\; \; x \succsim y \;\wedge\; y \succsim z \;\mbox{then}\; x \succsim z&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Also recall the definition of the strict preference relation:&lt;br /&gt;
:&amp;lt;math&amp;gt;x \succ y \quad \Leftrightarrow \quad x \succsim y \;\wedge\; y \not{\succsim} x&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then suppose:&lt;br /&gt;
&lt;br /&gt;
#&amp;lt;math&amp;gt;\forall j \ne i \in H \quad i \succ_j h \quad \forall h\ne i \in H\quad\mbox{(Everyone loves Brad)}&amp;lt;/math&amp;gt;&lt;br /&gt;
#&amp;lt;math&amp;gt;\{\emptyset\} \succ_i h \quad \forall h \in H\quad\mbox{(Brad would rather be alone)}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then &amp;lt;math&amp;gt;T = \{\emptyset\}&amp;lt;/math&amp;gt;  Q.E.D.&lt;br /&gt;
&lt;br /&gt;
===The Pitt-Depp Addendum===&lt;br /&gt;
&lt;br /&gt;
Adding the constraint that 'everybody loves somebody', or equivalently that:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\forall i \in H \quad \exists h \in H \;\mbox{s.t. }\; h \succ_i \{\emptyset\}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
does not make rational preferences sufficient to guarantee that &amp;lt;math&amp;gt; T \ne \{\emptyset\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
'''Proof''':&lt;br /&gt;
&lt;br /&gt;
Suppose:&lt;br /&gt;
#&amp;lt;math&amp;gt;\forall k \ne i,j \in H \quad i \succ_k h \quad \forall h\ne i,k \in H\quad\mbox{(Everyone, except Johnny, loves Brad)}&amp;lt;/math&amp;gt;&lt;br /&gt;
#&amp;lt;math&amp;gt;j \succ_i h \quad \forall h\ne j \in H\quad\mbox{(Brad loves Johnny)}&amp;lt;/math&amp;gt;&lt;br /&gt;
#&amp;lt;math&amp;gt;\exists h' \ne i,j \; \mbox{s.t.}\; h'\succ_j h \quad \forall h\ne h',j \in H\quad\mbox{(Johnny loves his wife)}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then &amp;lt;math&amp;gt;T = \{\emptyset\}&amp;lt;/math&amp;gt;  Q.E.D.&lt;br /&gt;
&lt;br /&gt;
Note: Objections to this proof on the grounds of the inclusion of Johnny Depp should be addressed to [https://scholar.harvard.edu/rabin/capital-montana Matthew Rabin].&lt;br /&gt;
&lt;br /&gt;
Additional Note: The claim that [https://en.wikipedia.org/wiki/Depp_v._Heard Johnny loves his wife hasn't aged well]. This should be changed to Johnny loves [https://en.wikipedia.org/wiki/Vanessa_Paradis French Actress and Singer Vanessa Paradis], his longest romantic partner and mother to his two children, as the odds of him doing better are now approaching zero.&lt;br /&gt;
&lt;br /&gt;
==The Age Rule==&lt;br /&gt;
&lt;br /&gt;
The de facto standard age rule is as follows:&lt;br /&gt;
&lt;br /&gt;
Denote two people &amp;lt;math&amp;gt;i\;&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;j\;&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;Age_i \le Age_j&amp;lt;/math&amp;gt;, then it is acceptable for them to date if and only if &lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;Age_i \ge \max \left\{\left(\frac{Age_j}{2}\right)+7\;,\;\underline{Age}\right\}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;math&amp;gt;\underline{Age} = 18 \;\mbox{if}\; Age_j \ge 18&amp;lt;/math&amp;gt;, except in Utah.&lt;br /&gt;
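&lt;br /&gt;
As a worked example, take &amp;lt;math&amp;gt;Age_j = 40&amp;lt;/math&amp;gt;, so that &amp;lt;math&amp;gt;\underline{Age} = 18&amp;lt;/math&amp;gt;. The bound then evaluates to:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\max \left\{\left(\frac{40}{2}\right)+7\;,\;18\right\} = \max\{27\;,\;18\} = 27&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
so the younger person must be at least 27.&lt;br /&gt;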
&lt;br /&gt;
I finally found a source to attribute this to: XKCD predates my posting significantly with its [http://xkcd.com/314/ 'Standard Creepiness Rule'].&lt;br /&gt;
&lt;br /&gt;
==Random Love==&lt;br /&gt;
&lt;br /&gt;
An amusing exploration of Random Love was recently posted as [http://what-if.xkcd.com/9/ XKCD Blog article No. 9].&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=Economic_definition_of_true_love&amp;diff=48667</id>
		<title>Economic definition of true love</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=Economic_definition_of_true_love&amp;diff=48667"/>
		<updated>2024-10-04T14:47:24Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* The Pitt-Depp Addendum */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Preamble==&lt;br /&gt;
&lt;br /&gt;
I originally tried to write an [[economic definition of true love]] for Valentine's Day in 2009 on a page entitled &amp;quot;Dating Ed&amp;quot;. It became one of the most popular pages on my website, receiving hundreds of thousands of views, and I maintained it across several different wikis. The version below no longer includes information about dating me, as I'm now married, but does bring back some other material that was deleted over the years.  &lt;br /&gt;
&lt;br /&gt;
==Definition of True Love==&lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;H&amp;lt;/math&amp;gt; denote the set of all entities (perhaps Humans, though we might also include dogs, cats and horses, according to historical precedent).&lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; denote the set of pairs of individuals who have True Love, such that:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\forall\{i,j\} \in T: \quad (i \succ_j h \quad \forall h \ne i) \wedge (j \succ_i h \quad \forall h \ne j), \quad h \in H \cup \{\emptyset\}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that:&lt;br /&gt;
*The definition employs strict preferences. A polyamorous definition might allow weak preferences instead.&lt;br /&gt;
*The union with the empty set allows for people who would rather be alone (e.g. Liz Lemon/Tina Fey), provided that we allow a mild abuse of notation so that &amp;lt;math&amp;gt;\{\emptyset\} \succ_{i} h&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==The Existence of True Love==&lt;br /&gt;
&lt;br /&gt;
Can we prove that &amp;lt;math&amp;gt; T \ne \{\emptyset\}&amp;lt;/math&amp;gt; ?&lt;br /&gt;
&lt;br /&gt;
===The Brad Pitt Problem===&lt;br /&gt;
&lt;br /&gt;
Rational preferences aren't sufficient to guarantee that &amp;lt;math&amp;gt; T \ne \{\emptyset\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
'''Proof:'''&lt;br /&gt;
&lt;br /&gt;
Recall that a preference relation is rational if it is complete and transitive:&lt;br /&gt;
#Completeness: &amp;lt;math&amp;gt;\forall x,y \in X: \quad x \succsim y \;\lor\; y \succsim x&amp;lt;/math&amp;gt;&lt;br /&gt;
#Transitivity: &amp;lt;math&amp;gt;\forall x,y,z \in X: \quad \mbox{if}\; \; x \succsim y \;\wedge\; y \succsim z \;\mbox{then}\; x \succsim z&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Also recall the definition of the strict preference relation:&lt;br /&gt;
:&amp;lt;math&amp;gt;x \succ y \quad \Leftrightarrow \quad x \succsim y \;\wedge\; y \not{\succsim} x&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then suppose:&lt;br /&gt;
&lt;br /&gt;
#&amp;lt;math&amp;gt;\forall j \ne i \in H \quad i \succ_j h \quad \forall h\ne i \in H\quad\mbox{(Everyone loves Brad)}&amp;lt;/math&amp;gt;&lt;br /&gt;
#&amp;lt;math&amp;gt;\{\emptyset\} \succ_i h \quad \forall h \in H\quad\mbox{(Brad would rather be alone)}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then &amp;lt;math&amp;gt;T = \{\emptyset\}&amp;lt;/math&amp;gt;  Q.E.D.&lt;br /&gt;
&lt;br /&gt;
===The Pitt-Depp Addendum===&lt;br /&gt;
&lt;br /&gt;
Adding the constraint that 'everybody loves somebody', or equivalently that:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\forall i \in H \quad \exists h \in H \;\mbox{s.t. }\; h \succ_i \{\emptyset\}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
does not make rational preferences sufficient to guarantee that &amp;lt;math&amp;gt; T \ne \{\emptyset\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
'''Proof''':&lt;br /&gt;
&lt;br /&gt;
Suppose:&lt;br /&gt;
#&amp;lt;math&amp;gt;\forall k \ne i,j \in H \quad i \succ_k h \quad \forall h\ne i,k \in H\quad\mbox{(Everyone, except Johnny, loves Brad)}&amp;lt;/math&amp;gt;&lt;br /&gt;
#&amp;lt;math&amp;gt;j \succ_i h \quad \forall h\ne j \in H\quad\mbox{(Brad loves Johnny)}&amp;lt;/math&amp;gt;&lt;br /&gt;
#&amp;lt;math&amp;gt;\exists h' \ne i,j \; \mbox{s.t.}\; h'\succ_j h \quad \forall h\ne h',j \in H\quad\mbox{(Johnny loves his wife)}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then &amp;lt;math&amp;gt;T = \{\emptyset\}&amp;lt;/math&amp;gt;  Q.E.D.&lt;br /&gt;
&lt;br /&gt;
Note: Objections to this proof on the grounds of the inclusion of Johnny Depp should be addressed to [https://scholar.harvard.edu/rabin/capital-montana Matthew Rabin].&lt;br /&gt;
Additional Note: The claim that [https://en.wikipedia.org/wiki/Depp_v._Heard Johnny loves his wife hasn't aged well]. This should be changed to Johnny loves himself, or some such.&lt;br /&gt;
&lt;br /&gt;
==The Age Rule==&lt;br /&gt;
&lt;br /&gt;
The de facto standard age rule is as follows:&lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;i\;&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;j\;&amp;lt;/math&amp;gt; denote two people such that &amp;lt;math&amp;gt;Age_i \le Age_j&amp;lt;/math&amp;gt;. It is acceptable for them to date if and only if &lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;Age_i \ge \max \left\{\left(\frac{Age_j}{2}\right)+7\;,\;\underline{Age}\right\}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;math&amp;gt;\underline{Age} = 18 \;\mbox{if}\; Age_j \ge 18&amp;lt;/math&amp;gt;, except in Utah.&lt;br /&gt;
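&lt;br /&gt;
As a worked illustration (the ages are chosen purely as an example and are not from any source), suppose &amp;lt;math&amp;gt;Age_j = 40&amp;lt;/math&amp;gt;. Then the rule requires&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;Age_i \ge \max \left\{\left(\frac{40}{2}\right)+7\;,\;18\right\} = \max\{27\;,\;18\} = 27&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
whereas for &amp;lt;math&amp;gt;Age_j = 20&amp;lt;/math&amp;gt; the half-plus-seven term is only 17, so the floor &amp;lt;math&amp;gt;\underline{Age} = 18&amp;lt;/math&amp;gt; binds instead.&lt;br /&gt;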
&lt;br /&gt;
I finally found a source to attribute this to: XKCD predates my posting significantly with its [http://xkcd.com/314/ 'Standard Creepiness Rule'].&lt;br /&gt;
&lt;br /&gt;
==Random Love==&lt;br /&gt;
&lt;br /&gt;
An amusing exploration of Random Love was recently posted as [http://what-if.xkcd.com/9/ XKCD Blog article No. 9].&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48666</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48666"/>
		<updated>2024-08-09T22:07:38Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Upgrading the nVidia Drivers */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. NVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbo &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a XEON e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD hard drive (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants on them, but their prices are high too and the details on their specs are limited (the MoBo and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts' cost is perhaps $4-5k now for a massive update to the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon e5-2620v3, 6 core @ 2.4GHz, 6x256KB L2 cache, 15MB L3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will build the RAID 5 array in software, rather than using the X99 chipset's RAID through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (right most) network port is connected to a DHCP broadcasting router.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
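&lt;br /&gt;
Once the box is back up, a quick sanity check (a sketch; the exact version strings will depend on when you run the updates) is to confirm which kernel is running and that the HWE metapackage is installed:&lt;br /&gt;
 uname -r&lt;br /&gt;
 dpkg -l linux-generic-hwe-18.04&lt;br /&gt;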
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
===Hardware and Drivers===&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
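&lt;br /&gt;
After a reboot, it is worth confirming that the NVIDIA kernel module is loaded and that both cards are visible. A minimal check, assuming the driver package put nvidia-smi on the path:&lt;br /&gt;
 lsmod | grep nvidia&lt;br /&gt;
 nvidia-smi&lt;br /&gt;
 lspci -k | grep -i -A 3 nvidia&lt;br /&gt;
  The kernel driver in use should now be nvidia rather than nouveau&lt;br /&gt;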
&lt;br /&gt;
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required &lt;br /&gt;
 for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
&lt;br /&gt;
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
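&lt;br /&gt;
To confirm that the paths took effect (a sketch, assuming you log out and back in so that /etc/environment is re-read), check that the CUDA compiler resolves and reports release 10.0:&lt;br /&gt;
 which nvcc&lt;br /&gt;
 nvcc --version&lt;br /&gt;
 echo $LD_LIBRARY_PATH&lt;br /&gt;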
&lt;br /&gt;
With version cuda 10.0, you don't need to edit rc.local to start the persistence daemon:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
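&lt;br /&gt;
You can check on the daemon with systemctl. This is a sketch that assumes the driver package registered the unit under the name nvidia-persistenced, which may differ between driver packages:&lt;br /&gt;
 systemctl status nvidia-persistenced&lt;br /&gt;
 nvidia-smi --query-gpu=name,persistence_mode --format=csv&lt;br /&gt;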
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
 It was already installed and the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd 850&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
&lt;br /&gt;
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID refers to /dev/md0, whereas the UUID in the contents of mdadm.conf is the UUID of the 3 RAID5 drives together:&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
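&lt;br /&gt;
For more detail than /proc/mdstat gives (member disks, array state, and any resync progress), mdadm can report on the array directly. This assumes the array is /dev/md0, as in the lsblk output above:&lt;br /&gt;
 mdadm --detail /dev/md0&lt;br /&gt;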
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array:&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both drives (md0 as backing, m.2 as cache). Note that when you do it in one command, the cache is attached to the backing device automatically.&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you screw up, cd to the cache set's directory under /sys/fs/bcache/ (it is named with the cache set's UUID) and then ls -l cache0. If there is an entry in there, echo 1 &amp;gt; stop. This unregisters the cache and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
&lt;br /&gt;
And update your boot image and give it a reboot to check the new bcache array comes back up ok:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
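&lt;br /&gt;
Once the box is back up, the bcache sysfs entries show whether the cache is attached. The following is a sketch based on the kernel bcache documentation linked above, and assumes the device came up as bcache0. Note that bcache defaults to writethrough caching; switching to writeback (the last command) is optional and trades some safety for write speed:&lt;br /&gt;
 lsblk&lt;br /&gt;
 cat /sys/block/bcache0/bcache/state&lt;br /&gt;
  Should report clean or dirty, rather than no cache&lt;br /&gt;
 cat /sys/block/bcache0/bcache/cache_mode&lt;br /&gt;
 echo writeback &amp;gt; /sys/block/bcache0/bcache/cache_mode&lt;br /&gt;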
&lt;br /&gt;
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check that samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask= 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
&lt;br /&gt;
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
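&lt;br /&gt;
To test the share itself rather than just the listing, you can connect as the researcher user and list the contents (a sketch, assuming the share name bulk and the account created above; it will prompt for the samba password):&lt;br /&gt;
 smbclient //localhost/bulk -U researcher -c ls&lt;br /&gt;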
&lt;br /&gt;
And add users to the samba group (if not already):&lt;br /&gt;
 usermod -G smbusers researcher #Note that this sets the supplementary groups and will overwrite sudo or other group assignments, so don't do it with your main account. Instead, append the group:&lt;br /&gt;
  usermod -a -G smbusers ed&lt;br /&gt;
&lt;br /&gt;
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use CUDA 10.0:&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the latest official CUDA image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
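&lt;br /&gt;
If the page doesn't load, check that the container is actually up and look at its logs. This sketch assumes the container name digits given by --name in the run command above; the last two commands stop and restart it:&lt;br /&gt;
 docker ps --filter name=digits&lt;br /&gt;
 docker logs digits&lt;br /&gt;
 docker stop digits&lt;br /&gt;
 docker start digits&lt;br /&gt;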
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
Note: you can clean up stopped docker containers (along with unused networks, dangling images and build cache) with&lt;br /&gt;
 docker system prune&lt;br /&gt;
 &lt;br /&gt;
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an install directory in /bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
From https://www.anaconda.com/distribution/ the latest release ships Python 3.7, so:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
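&lt;br /&gt;
The checksum only helps if it is compared against the hash that Anaconda publishes for the installer. A minimal sketch of that comparison follows; the function name verify_sha256 is made up for illustration, and the expected hash must be copied from Anaconda's own hash listing rather than from this page:&lt;br /&gt;

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# $1 = path to the file, $2 = expected hex digest (supplied by the caller).
verify_sha256() {
    # sha256sum prints "digest  filename"; keep only the digest.
    actual=$(sha256sum "$1" | awk '{print $1}')
    if [ "$actual" = "$2" ]; then
        echo "OK"
    else
        echo "MISMATCH"
        return 1
    fi
}
```

For example, run verify_sha256 Anaconda3-2019.03-Linux-x86_64.sh EXPECTED_HASH, where EXPECTED_HASH is a placeholder for the published value.&lt;br /&gt;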
&lt;br /&gt;
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
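&lt;br /&gt;
As a quick sanity check before building a separate environment, the version range quoted above can be encoded in a small shell test. This is only a sketch: the function name theano_python_ok is made up, and the 3.4 to 3.6 bounds are taken from the note above rather than checked against the Theano documentation:&lt;br /&gt;

```shell
# Hypothetical helper: is a Python version inside the range quoted above
# (at least 3.4 and below 3.6)? Accepts strings like "3.7" or "3.7.3".
theano_python_ok() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # minor version, dropping any patch level
    if [ "$major" -ne 3 ]; then
        echo "no"
        return
    fi
    if [ "$minor" -ge 4 ]; then
        if [ "$minor" -lt 6 ]; then
            echo "yes"
            return
        fi
    fi
    echo "no"
}

theano_python_ok 3.7.3
theano_python_ok 3.5
```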
&lt;br /&gt;
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install a VNC client on the remote machine that you will connect from. We use the standalone exe from TigerVNC.&lt;br /&gt;
&lt;br /&gt;
Now install TightVNC, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As user &lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user (ailia)&lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
After creating or editing the unit file:&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
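&lt;br /&gt;
As a rough sketch of the numbering used here, a VNC display :N conventionally listens on TCP port 5900+N, so display :2 is port 5902. The helper name vnc_port is made up for illustration. Most viewers treat host:N as a display number and host::P as a raw port, though that is worth checking against your client's documentation:&lt;br /&gt;

```shell
# Hypothetical helper: map a VNC display number to its conventional TCP port.
vnc_port() {
    echo $((5900 + $1))
}

# Display :2, as used by vncserver@2 on this box.
vnc_port 2
```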
&lt;br /&gt;
Instructions on how to set up an IP tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in PuTTY and checked which local port was listening (it was 5901) and that it was not firewalled, using the Listening Ports tab under Network in resmon.exe (it said allowed, not restricted, under firewall status). VNC seemed to be running fine on Bastard, and I tried connecting to localhost::1 (that is 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The localhost port seems to be open and listening just fine: &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -p 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
&lt;br /&gt;
====Later Notes====&lt;br /&gt;
&lt;br /&gt;
=====Change the resolution=====&lt;br /&gt;
&lt;br /&gt;
I came back and changed the resolution to make it work on one of my portrait desktop monitors.&lt;br /&gt;
See https://www.tightvnc.com/vncserver.1.php&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  Change line:&lt;br /&gt;
   ExecStart=/usr/bin/vncserver -depth 24 -geometry 1440x2560 :%i&lt;br /&gt;
  (Note that the size is 2160x3840 divided by 150%.) Leave the color depth alone, as changing it breaks the server (see the note above).&lt;br /&gt;
 systemctl daemon-reload&lt;br /&gt;
 systemctl enable vncserver@2.service&lt;br /&gt;
&lt;br /&gt;
As Ed:&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
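&lt;br /&gt;
The geometry arithmetic above (native resolution divided by the display scaling) can be sketched as a small helper. The function name scaled_geometry is made up for illustration, and it assumes the scaling is given as a whole-number percentage:&lt;br /&gt;

```shell
# Hypothetical helper: divide a native resolution by a scaling percentage
# to get a value for vncserver's -geometry option.
# $1 = native width, $2 = native height, $3 = scaling percentage
scaled_geometry() {
    echo "$(( $1 * 100 / $3 ))x$(( $2 * 100 / $3 ))"
}

# A 2160x3840 portrait monitor at 150% scaling.
scaled_geometry 2160 3840 150
```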
&lt;br /&gt;
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
&lt;br /&gt;
=====Cut And Paste=====&lt;br /&gt;
&lt;br /&gt;
Also, try to fix the cut-and-paste issue. See, for example, https://unix.stackexchange.com/questions/35030/how-can-i-copy-paste-data-to-and-from-the-windows-clipboard-to-an-opensuse-clipb&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 apt-get install autocutsel&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  autocutsel -fork  &lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
&lt;br /&gt;
Though this might have been working fine anyway. Just change the terminal and all will be well.&lt;br /&gt;
&lt;br /&gt;
=====Use XFCE terminal=====&lt;br /&gt;
&lt;br /&gt;
Change Settings: Preferred Applications -&amp;gt; Utilities -&amp;gt; Terminal to XFCE&lt;br /&gt;
&lt;br /&gt;
This seems to fix everything. For reference, the instructions for customizing the menu are here: https://wiki.xfce.org/howto/customize-menu&lt;br /&gt;
 cat /etc/xdg/menus/xfce-applications.menu&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.02 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  Which seems ok, despite having a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=2020.2.3&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;br /&gt;
&lt;br /&gt;
Note that when I came back to the box the launcher didn't work...&lt;br /&gt;
&lt;br /&gt;
==== MATLAB ====&lt;br /&gt;
&lt;br /&gt;
I installed MATLAB R2024a by downloading the zip, running&lt;br /&gt;
 sudo ./install&lt;br /&gt;
&lt;br /&gt;
and using the defaults of /usr/local/MATLAB/R2024 etc. The license number is 41201644.&lt;br /&gt;
&lt;br /&gt;
===Upgrading the nVIDIA Drivers===&lt;br /&gt;
&lt;br /&gt;
In MATLAB, I ran:&lt;br /&gt;
 gpuDevice&lt;br /&gt;
  Error using gpuDevice (line 26)&lt;br /&gt;
  Graphics driver is out of date. Download and install the latest graphics driver for your GPU from NVIDIA.&lt;br /&gt;
&lt;br /&gt;
Some quick checks showed that I was using driver version 430.26 on ubuntu 18.04.02. &lt;br /&gt;
 nvidia-smi&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
I couldn't quite get MATLAB to tell me what I needed:&lt;br /&gt;
* https://www.mathworks.com/help/parallel-computing/gpu-computing-requirements.html&lt;br /&gt;
* https://www.mathworks.com/help/parallel-computing/run-mex-functions-containing-cuda-code.html#mw_20acaa78-994d-4695-ab4b-bca1cfc3dbac&lt;br /&gt;
&lt;br /&gt;
For MEX, I have 10.2 and need 12.2 of the CUDA toolkit:&lt;br /&gt;
 MATLAB Release	CUDA Toolkit Version&lt;br /&gt;
 R2024a	12.2&lt;br /&gt;
 ...&lt;br /&gt;
 R2020b	10.2&lt;br /&gt;
&lt;br /&gt;
However:&lt;br /&gt;
* nVidia said the latest version was https://www.nvidia.com/Download/driverResults.aspx/230357/en-us/&lt;br /&gt;
* The repo said the highest version for 18.04 is 545: https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 runlevel&lt;br /&gt;
  #5&lt;br /&gt;
 systemctl get-default&lt;br /&gt;
  #graphical.target&lt;br /&gt;
 systemctl set-default multi-user.target&lt;br /&gt;
 systemctl reboot&lt;br /&gt;
&lt;br /&gt;
As ed: &lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
  Killing Xtightvnc process ID 1844&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 #sh ./NVIDIA-Linux-x86_64-550.107.02.run&lt;br /&gt;
 # The distribution-provided pre-install script failed!&lt;br /&gt;
 #cat /var/log/nvidia-installer.log&lt;br /&gt;
&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt install nvidia-driver-545&lt;br /&gt;
 systemctl set-default graphical.target&lt;br /&gt;
 systemctl reboot&lt;br /&gt;
&lt;br /&gt;
Run MATLAB&lt;br /&gt;
 gpuDevice&lt;br /&gt;
   Name: 'NVIDIA TITAN RTX'&lt;br /&gt;
                     Index: 1&lt;br /&gt;
         ComputeCapability: '7.5'&lt;br /&gt;
     GraphicsDriverVersion: '545.29.06'&lt;br /&gt;
            ToolkitVersion: 12.2000&lt;br /&gt;
&lt;br /&gt;
 gpuDevice(2)&lt;br /&gt;
                      Name: 'NVIDIA TITAN Xp'&lt;br /&gt;
                     Index: 2&lt;br /&gt;
         ComputeCapability: '6.1'&lt;br /&gt;
            SupportsDouble: 1&lt;br /&gt;
     GraphicsDriverVersion: '545.29.06'&lt;br /&gt;
            ToolkitVersion: 12.2000&lt;br /&gt;
&lt;br /&gt;
The messages were:&lt;br /&gt;
 apt install nvidia-driver-545&lt;br /&gt;
 	The following additional packages will be installed:&lt;br /&gt;
 	  libnvidia-cfg1-545 libnvidia-common-545 libnvidia-compute-545 libnvidia-compute-545:i386 libnvidia-decode-545&lt;br /&gt;
 	  libnvidia-decode-545:i386 libnvidia-encode-545 libnvidia-encode-545:i386 libnvidia-extra-545 libnvidia-fbc1-545&lt;br /&gt;
 	  libnvidia-fbc1-545:i386 libnvidia-gl-545 libnvidia-gl-545:i386 nvidia-compute-utils-545 nvidia-dkms-545&lt;br /&gt;
 	  nvidia-firmware-545-545.29.06 nvidia-kernel-common-545 nvidia-kernel-source-545 nvidia-utils-545&lt;br /&gt;
 	  xserver-xorg-video-nvidia-545&lt;br /&gt;
 	The following packages will be REMOVED:&lt;br /&gt;
 	  libnvidia-cfg1-430 libnvidia-common-430 libnvidia-compute-430 libnvidia-compute-430:i386 libnvidia-decode-430&lt;br /&gt;
 	  libnvidia-decode-430:i386 libnvidia-encode-430 libnvidia-encode-430:i386 libnvidia-fbc1-430 libnvidia-fbc1-430:i386&lt;br /&gt;
 	  libnvidia-gl-430 libnvidia-gl-430:i386 libnvidia-ifr1-430 libnvidia-ifr1-430:i386 nvidia-compute-utils-430 nvidia-dkms-430&lt;br /&gt;
  	  nvidia-driver-430 nvidia-kernel-common-430 nvidia-kernel-source-430 nvidia-utils-430 xserver-xorg-video-nvidia-430&lt;br /&gt;
 	The following NEW packages will be installed:&lt;br /&gt;
 	  libnvidia-cfg1-545 libnvidia-common-545 libnvidia-compute-545 libnvidia-compute-545:i386 libnvidia-decode-545&lt;br /&gt;
 	  libnvidia-decode-545:i386 libnvidia-encode-545 libnvidia-encode-545:i386 libnvidia-extra-545 libnvidia-fbc1-545&lt;br /&gt;
 	  libnvidia-fbc1-545:i386 libnvidia-gl-545 libnvidia-gl-545:i386 nvidia-compute-utils-545 nvidia-dkms-545 nvidia-driver-545&lt;br /&gt;
 	  nvidia-firmware-545-545.29.06 nvidia-kernel-common-545 nvidia-kernel-source-545 nvidia-utils-545&lt;br /&gt;
 	  xserver-xorg-video-nvidia-545&lt;br /&gt;
 	0 upgraded, 21 newly installed, 21 to remove and 2 not upgraded.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48665</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48665"/>
		<updated>2024-08-09T22:06:09Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. nVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbo &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a XEON e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD hard drive (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants on them, but their prices are high too and the details on their specs are limited (the MoBo and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts' cost is perhaps $4-5k now for a massive update to the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon e5-2620v3, 6 core @ 2.4GHz, 6x256KB level 2 cache, 15MB level 3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will do the RAID 5 array in software, rather than using the X99 chipset RAID through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (right most) network port is connected to a DHCP broadcasting router.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
===Hardware and Drivers===&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
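&lt;br /&gt;
To pull the recommended driver out of that listing without reading it by eye, a small filter along these lines should work. It is a sketch that assumes the output format shown above, where the recommended entry ends with the word "recommended". The function name recommended_driver is made up, and the sample lines are copied from the listing above:&lt;br /&gt;

```shell
# Hypothetical helper: read `ubuntu-drivers devices` output on stdin and
# print the package name of the driver flagged as recommended.
recommended_driver() {
    # Driver lines look like: "driver : NAME - SOURCE LICENCE [recommended]"
    awk '$1 == "driver" { if ($NF == "recommended") print $3 }'
}

# Sample input taken from the listing above.
printf '%s\n' \
    'driver   : nvidia-driver-418 - third-party free' \
    'driver   : nvidia-driver-430 - third-party free recommended' \
    'driver   : xserver-xorg-video-nouveau - distro free builtin' | recommended_driver
```

On the box itself, pipe the real command into the function: ubuntu-drivers devices | recommended_driver&lt;br /&gt;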
&lt;br /&gt;
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
&lt;br /&gt;
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required &lt;br /&gt;
 for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
&lt;br /&gt;
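The ${PATH:+:${PATH}} idiom in those exports adds the separating colon only when the variable already has a value. That matters because an empty entry in a search path is treated as the current directory. A minimal sketch of the same expansion, using a hypothetical helper called prepend_dir:&lt;br /&gt;
&lt;br /&gt;
```shell
# Prepend a directory to a colon-separated search path without leaving a
# trailing colon when the existing path is empty or unset.
prepend_dir() {
    new_dir=$1
    current=$2
    printf '%s\n' "${new_dir}${current:+:${current}}"
}

prepend_dir /usr/local/cuda-10.0/lib64 ""            # prints /usr/local/cuda-10.0/lib64
prepend_dir /usr/local/cuda-10.0/bin /usr/bin:/bin   # prints /usr/local/cuda-10.0/bin:/usr/bin:/bin
```
&lt;br /&gt;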
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
&lt;br /&gt;
With CUDA 10.0, you no longer need to edit rc.local to start the persistence daemon with:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
 It was already installed and the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd 870&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
&lt;br /&gt;
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID in fstab is the filesystem UUID of /dev/md0, whereas the UUID in mdadm.conf identifies the RAID5 array assembled from the three drives:&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array (this removes the filesystem signatures, so the existing data on /bulk becomes inaccessible):&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both drives (md0 as backing, m.2 as cache). Note that when you create both in one command, the cache is attached to the backing device automatically.&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you make a mistake, cd to the cache set's directory under /sys/fs/bcache/ (it is named with the set's UUID) and run ls -l cache0. If there is an entry in there, run echo 1 &amp;gt; stop. This unregisters the cache set and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
&lt;br /&gt;
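The new fstab entry can be derived from the blkid output rather than retyped by hand. The sketch below assumes blkid's usual DEVICE: UUID="..." TYPE="..." layout and hardcodes the ext4, rw, 0 0 fields used above; fstab_line is a hypothetical helper name:&lt;br /&gt;
&lt;br /&gt;
```shell
# Read one line of blkid output on stdin and print a matching ext4 fstab entry
# for the mount point given as the first argument. Fails if no UUID is found.
fstab_line() {
    mount_point=$1
    fs_uuid=$(sed -n 's/.* UUID="\([^"]*\)".*/\1/p')
    if [ -z "$fs_uuid" ]; then
        return 1
    fi
    printf 'UUID=%s %s ext4 rw 0 0\n' "$fs_uuid" "$mount_point"
}

echo '/dev/bcache0: UUID="4c63f20b-ad35-477d-bfaa-82571beba841" TYPE="ext4"' |
    fstab_line /bulk
# prints UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0
```
On the box this would be: blkid /dev/bcache0 | fstab_line /bulk. Alternatively, blkid -s UUID -o value /dev/bcache0 prints just the UUID.&lt;br /&gt;
&lt;br /&gt;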
Then update the initramfs and reboot to check that the new bcache device comes back up:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
&lt;br /&gt;
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask= 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
&lt;br /&gt;
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
&lt;br /&gt;
And add other users to the samba group (if they are not already members). Note that usermod -G without -a replaces the user's supplementary groups, which removes sudo and other group assignments, so append with -a instead:&lt;br /&gt;
 usermod -aG smbusers ed&lt;br /&gt;
&lt;br /&gt;
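Before touching group membership with usermod, it helps to check whether the account is already in the group. A small sketch, assuming the space-separated group list that id -nG prints; in_group is a hypothetical helper name:&lt;br /&gt;
&lt;br /&gt;
```shell
# Succeed if the group named in $1 appears in the space-separated list in $2,
# which is the format printed by "id -nG".
in_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with a made-up group list; on the box use: in_group smbusers "$(id -nG ed)"
if in_group smbusers "ed sudo smbusers"; then
    echo "already a member"
fi
```
Padding both sides with spaces keeps a group such as smb from matching smbusers.&lt;br /&gt;
&lt;br /&gt;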
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use the CUDA 10.0 image:&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the CUDA 10.0 base image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
Note: you can remove stopped containers (along with unused networks, dangling images, and build cache) with&lt;br /&gt;
 docker system prune&lt;br /&gt;
 &lt;br /&gt;
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an install directory in /bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
From https://www.anaconda.com/distribution/ the latest distribution is for Python 3.7, so:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
&lt;br /&gt;
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
&lt;br /&gt;
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install a VNC client on the machine you will connect from. We use the standalone exe from TigerVNC. &lt;br /&gt;
&lt;br /&gt;
Now install TightVNC on the DevBox, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As user &lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user (ailia)&lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
After creating or editing the unit file, reload systemd and (re)start the service:&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
&lt;br /&gt;
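The display number fixes the TCP port: display N listens on port 5900 + N, so :2 is port 5902. The sketch below shows that arithmetic, plus an OpenSSH port forward equivalent to a PuTTY tunnel. The helper names are hypothetical, and the ed@192.168.2.202 login in the example is an assumption for illustration:&lt;br /&gt;
&lt;br /&gt;
```shell
# Map a VNC display number to its TCP port (display N listens on 5900 + N).
vnc_port() {
    echo $((5900 + $1))
}

# Print (but do not run) an OpenSSH port forward from a local display to a
# remote display on the given host.
tunnel_command() {
    printf 'ssh -L %s:localhost:%s %s\n' "$(vnc_port "$1")" "$(vnc_port "$2")" "$3"
}

vnc_port 2                            # prints 5902
tunnel_command 1 2 ed@192.168.2.202   # prints ssh -L 5901:localhost:5902 ed@192.168.2.202
```
Many viewers, TightVNC and RealVNC among them, read host:N as display N and host::N as TCP port N, so the output of vnc_port belongs after a double colon.&lt;br /&gt;
&lt;br /&gt;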
Instructions on how to set up an SSH tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in puTTY.exe and checked to see which local port was listening (it was 5901) and not firewalled using the listening ports tab under network on resmon.exe (it said allowed, not restricted under firewall status). VNC seemed to be running fine on Bastard, and I tried connecting to localhost::1 (that is 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The localhost port seems to be open and listening just fine: &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -p 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
&lt;br /&gt;
====Later Notes====&lt;br /&gt;
&lt;br /&gt;
=====Change the resolution=====&lt;br /&gt;
&lt;br /&gt;
I came back and changed the resolution to make it work on one of my portrait desktop monitors.&lt;br /&gt;
See https://www.tightvnc.com/vncserver.1.php&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  Change line:&lt;br /&gt;
   ExecStart=/usr/bin/vncserver -depth 24 -geometry 1440x2560 :%i&lt;br /&gt;
  (Note that the size is 2160x3840 divided by 1.5, i.e., 150% scaling.) Leave the color depth at 24, as changing it breaks the server.&lt;br /&gt;
 systemctl daemon-reload&lt;br /&gt;
 systemctl enable vncserver@2.service&lt;br /&gt;
&lt;br /&gt;
As Ed:&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
&lt;br /&gt;
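The geometry above is just the monitor's native resolution divided by the display scaling. A minimal sketch of that arithmetic, using integer math and a hypothetical helper called scaled_geometry:&lt;br /&gt;
&lt;br /&gt;
```shell
# Divide a resolution (width $1, height $2) by a scaling percentage ($3)
# and print it in the WIDTHxHEIGHT form that vncserver -geometry expects.
scaled_geometry() {
    echo "$(( $1 * 100 / $3 ))x$(( $2 * 100 / $3 ))"
}

scaled_geometry 2160 3840 150   # prints 1440x2560
```
&lt;br /&gt;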
=====Cut And Paste=====&lt;br /&gt;
&lt;br /&gt;
Also, try to fix the cut-and-paste issue. See, for example, https://unix.stackexchange.com/questions/35030/how-can-i-copy-paste-data-to-and-from-the-windows-clipboard-to-an-opensuse-clipb&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 apt-get install autocutsel&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  autocutsel -fork  &lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
&lt;br /&gt;
Though this might have been working fine anyway. Just change the terminal and all will be well.&lt;br /&gt;
&lt;br /&gt;
=====Use XFCE terminal=====&lt;br /&gt;
&lt;br /&gt;
Change Settings: Preferred Applications -&amp;gt; Utilities -&amp;gt; Terminal to XFCE&lt;br /&gt;
&lt;br /&gt;
Note that this seems to fix everything, but the instructions for customizing the menu are here: https://wiki.xfce.org/howto/customize-menu&lt;br /&gt;
 cat /etc/xdg/menus/xfce-applications.menu&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.02 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  Which seems ok, despite having a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=2020.2.3&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;br /&gt;
&lt;br /&gt;
Note that when I came back to the box the launcher didn't work...&lt;br /&gt;
&lt;br /&gt;
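One possible cause, which is an assumption rather than something confirmed on this box, is that the launcher hardcodes snap revision 214, and that directory goes away when the snap refreshes. Snap keeps a "current" symlink next to the numbered revisions, so the paths can be rewritten to use it. A sketch, with a hypothetical helper name:&lt;br /&gt;
&lt;br /&gt;
```shell
# Rewrite a hardcoded snap revision directory to the "current" symlink.
# Reads desktop-entry lines on stdin and writes the rewritten lines to stdout.
use_current_revision() {
    sed 's#/snap/pycharm-community/[0-9][0-9]*/#/snap/pycharm-community/current/#g'
}

echo 'Icon=/snap/pycharm-community/214/bin/pycharm.png' | use_current_revision
# prints Icon=/snap/pycharm-community/current/bin/pycharm.png
```
Lines without a numbered revision pass through unchanged, so the whole pycharm.desktop file can be piped through it and the result reviewed before replacing the original.&lt;br /&gt;
&lt;br /&gt;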
==== MATLAB ====&lt;br /&gt;
&lt;br /&gt;
I installed MATLAB R2024a by downloading the zip, running&lt;br /&gt;
 sudo ./install&lt;br /&gt;
&lt;br /&gt;
and using the defaults of /usr/local/MATLAB/R2024a etc. The license number is 41201644.&lt;br /&gt;
&lt;br /&gt;
===Upgrading the nVidia Drivers===&lt;br /&gt;
&lt;br /&gt;
In MATLAB, I ran:&lt;br /&gt;
 gpuDevice&lt;br /&gt;
  Error using gpuDevice (line 26)&lt;br /&gt;
  Graphics driver is out of date. Download and install the latest graphics driver for your GPU from NVIDIA.&lt;br /&gt;
&lt;br /&gt;
Some quick checks showed that I was using driver version 430.26 on Ubuntu 18.04.2. &lt;br /&gt;
 nvidia-smi&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
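Driver versions are dotted numbers, so comparing them as plain strings or decimals can mislead. The sketch below compares two versions with GNU sort -V. The helper name is hypothetical, and the target 545.29.06 is used only for illustration, since the minimum version MATLAB actually requires is not stated here:&lt;br /&gt;
&lt;br /&gt;
```shell
# Succeed if dotted version $1 is greater than or equal to version $2.
# Relies on GNU sort -V, which is available on Ubuntu.
version_at_least() {
    newest=$(printf '%s\n' "$1" "$2" | sort -V | tail -n 1)
    [ "$newest" = "$1" ]
}

if version_at_least 430.26 545.29.06; then
    echo "driver is new enough"
else
    echo "driver is older than the target"
fi
# prints driver is older than the target
```
On the box, the installed version can be read with nvidia-smi --query-gpu=driver_version --format=csv,noheader and passed in as the first argument.&lt;br /&gt;
&lt;br /&gt;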
I couldn't quite get MATLAB to tell me what I needed:&lt;br /&gt;
* https://www.mathworks.com/help/parallel-computing/gpu-computing-requirements.html&lt;br /&gt;
* https://www.mathworks.com/help/parallel-computing/run-mex-functions-containing-cuda-code.html#mw_20acaa78-994d-4695-ab4b-bca1cfc3dbac&lt;br /&gt;
&lt;br /&gt;
For MEX, I have 10.2 and need 12.2 of the CUDA toolkit:&lt;br /&gt;
 MATLAB Release	CUDA Toolkit Version&lt;br /&gt;
 R2024a	12.2&lt;br /&gt;
 ...&lt;br /&gt;
 R2020b	10.2&lt;br /&gt;
&lt;br /&gt;
However:&lt;br /&gt;
* nVidia said the latest version was https://www.nvidia.com/Download/driverResults.aspx/230357/en-us/&lt;br /&gt;
* The repo said the highest version for 18.04 is 545: https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 runlevel&lt;br /&gt;
  #5&lt;br /&gt;
 systemctl get-default&lt;br /&gt;
  #graphical.target&lt;br /&gt;
 systemctl set-default multi-user.target&lt;br /&gt;
 systemctl reboot&lt;br /&gt;
&lt;br /&gt;
As ed: &lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
	Killing Xtightvnc process ID 1844&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 #sh ./NVIDIA-Linux-x86_64-550.107.02.run&lt;br /&gt;
 # The distribution-provided pre-install script failed!&lt;br /&gt;
 #cat /var/log/nvidia-installer.log&lt;br /&gt;
&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt install nvidia-driver-545&lt;br /&gt;
 systemctl set-default graphical.target&lt;br /&gt;
 systemctl reboot&lt;br /&gt;
&lt;br /&gt;
Run MATLAB&lt;br /&gt;
 gpuDevice&lt;br /&gt;
 gpuDevice(2)&lt;br /&gt;
&lt;br /&gt;
The messages from the driver install were:&lt;br /&gt;
 apt install nvidia-driver-545&lt;br /&gt;
 	The following additional packages will be installed:&lt;br /&gt;
 	  libnvidia-cfg1-545 libnvidia-common-545 libnvidia-compute-545 libnvidia-compute-545:i386 libnvidia-decode-545&lt;br /&gt;
 	  libnvidia-decode-545:i386 libnvidia-encode-545 libnvidia-encode-545:i386 libnvidia-extra-545 libnvidia-fbc1-545&lt;br /&gt;
 	  libnvidia-fbc1-545:i386 libnvidia-gl-545 libnvidia-gl-545:i386 nvidia-compute-utils-545 nvidia-dkms-545&lt;br /&gt;
 	  nvidia-firmware-545-545.29.06 nvidia-kernel-common-545 nvidia-kernel-source-545 nvidia-utils-545&lt;br /&gt;
 	  xserver-xorg-video-nvidia-545&lt;br /&gt;
 	The following packages will be REMOVED:&lt;br /&gt;
 	  libnvidia-cfg1-430 libnvidia-common-430 libnvidia-compute-430 libnvidia-compute-430:i386 libnvidia-decode-430&lt;br /&gt;
 	  libnvidia-decode-430:i386 libnvidia-encode-430 libnvidia-encode-430:i386 libnvidia-fbc1-430 libnvidia-fbc1-430:i386&lt;br /&gt;
 	  libnvidia-gl-430 libnvidia-gl-430:i386 libnvidia-ifr1-430 libnvidia-ifr1-430:i386 nvidia-compute-utils-430 nvidia-dkms-430&lt;br /&gt;
  	  nvidia-driver-430 nvidia-kernel-common-430 nvidia-kernel-source-430 nvidia-utils-430 xserver-xorg-video-nvidia-430&lt;br /&gt;
 	The following NEW packages will be installed:&lt;br /&gt;
 	  libnvidia-cfg1-545 libnvidia-common-545 libnvidia-compute-545 libnvidia-compute-545:i386 libnvidia-decode-545&lt;br /&gt;
 	  libnvidia-decode-545:i386 libnvidia-encode-545 libnvidia-encode-545:i386 libnvidia-extra-545 libnvidia-fbc1-545&lt;br /&gt;
 	  libnvidia-fbc1-545:i386 libnvidia-gl-545 libnvidia-gl-545:i386 nvidia-compute-utils-545 nvidia-dkms-545 nvidia-driver-545&lt;br /&gt;
 	  nvidia-firmware-545-545.29.06 nvidia-kernel-common-545 nvidia-kernel-source-545 nvidia-utils-545&lt;br /&gt;
 	  xserver-xorg-video-nvidia-545&lt;br /&gt;
 	0 upgraded, 21 newly installed, 21 to remove and 2 not upgraded.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48664</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48664"/>
		<updated>2024-08-08T22:00:29Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Documentation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. nVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbo &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a XEON e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD hard drive (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants on them, but their prices are high too and the details on their specs are limited (the MoBo and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts' cost is perhaps $4-5k now for a massive update to the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon e5-2620v3, 6 core @ 2.4GHz, 6x256KB level 2 cache, 15MB level 3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will do the RAID 5 array in software, rather than using the X99 RAID through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (right most) network port is connected to a DHCP broadcasting router.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
===Hardware and Drivers===&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
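If you want to pick the recommended driver out of that listing in a script, the following is a minimal sketch. It only parses text in the format shown above; the function name recommended_driver and the SAMPLE constant are our own, not part of ubuntu-drivers, and the sample is copied from the output on this page.&lt;br /&gt;

```python
# Sample text copied from the ubuntu-drivers output on this page.
SAMPLE = '''\
vendor   : NVIDIA Corporation
model    : GP102 [TITAN Xp]
driver   : nvidia-driver-418 - third-party free
driver   : nvidia-driver-415 - third-party free
driver   : nvidia-driver-430 - third-party free recommended
driver   : nvidia-driver-396 - third-party free
driver   : nvidia-driver-390 - distro non-free
driver   : nvidia-driver-410 - third-party free
driver   : xserver-xorg-video-nouveau - distro free builtin
'''


def recommended_driver(text):
    '''Return the package on the driver line flagged as recommended, or None.'''
    for line in text.splitlines():
        key, _, value = line.partition(':')
        if key.strip() != 'driver':
            continue  # skip vendor, model, modalias and header lines
        fields = value.split()
        if fields and fields[-1] == 'recommended':
            return fields[0]
    return None


print(recommended_driver(SAMPLE))
```

For the listing above this prints nvidia-driver-430, which is the package installed later on this page.&lt;br /&gt;
&lt;br /&gt;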
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
&lt;br /&gt;
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required &lt;br /&gt;
 for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
&lt;br /&gt;
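The ${VAR:+:${VAR}} idiom in those export lines adds the colon separator only when the variable is already set and non-empty. That matters because an empty entry in a search path is treated as the current directory. A minimal sketch of the same logic follows; the helper name prepend_path is ours, purely for illustration.&lt;br /&gt;

```python
def prepend_path(new_dir, existing):
    '''Mimic NEW${VAR:+:${VAR}}: add the separator only if VAR is non-empty.'''
    if existing:
        return new_dir + ':' + existing
    # Unset or empty: return the new directory alone, with no trailing colon.
    return new_dir


print(prepend_path('/usr/local/cuda-10.0/bin', '/usr/bin:/bin'))
print(prepend_path('/usr/local/cuda-10.0/lib64', ''))
```

The first call gives /usr/local/cuda-10.0/bin:/usr/bin:/bin and the second gives /usr/local/cuda-10.0/lib64 on its own.&lt;br /&gt;
&lt;br /&gt;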
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
&lt;br /&gt;
With CUDA 10.0, you don't need to edit rc.local to start the persistence daemon with:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
 It was already installed and the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd 850&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
&lt;br /&gt;
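The sizes in that listing follow from how RAID5 works: one disk's worth of space goes to parity, so three 4TB drives give two drives' worth of usable space. lsblk then reports it in binary units, which is why the 8TB array shows up as 7.3T. A rough sketch of the arithmetic, assuming nominal 4TB drives (real drives differ by a few megabytes) and using helper names of our own:&lt;br /&gt;

```python
def raid5_usable_bytes(disk_count, disk_bytes):
    '''RAID5 keeps one disk of parity, leaving (n - 1) disks of data.

    Assumes at least three equally sized disks; no validation is done.
    '''
    return (disk_count - 1) * disk_bytes


def to_tib(byte_count):
    '''Convert bytes to binary terabytes (TiB), which lsblk abbreviates as T.'''
    return byte_count / 2 ** 40


disk = 4 * 10 ** 12  # a nominal 4TB drive, counted in decimal bytes
usable = raid5_usable_bytes(3, disk)
print(round(to_tib(usable), 1))
```

This prints 7.3, matching the md0 line above.&lt;br /&gt;
&lt;br /&gt;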
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID refers to /dev/md0, whereas the UUID in the contents of mdadm.conf is the UUID of the 3 RAID5 drives together:&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array:&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both drives (md0 as backing, m.2 as cache). Note that when you do it in one command, the cache is attached to the backing device automatically.&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you screw up, cd to /sys/fs/bcache/whatever and then ls -l cache0. If there is an entry in there echo 1 &amp;gt; stop. This unregisters the cache and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
&lt;br /&gt;
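The fstab edit can also be scripted. The following is a minimal sketch that works on a text copy only and never touches /etc/fstab itself. The function name swap_mount is hypothetical, and the UUIDs and fstab contents are the ones shown on this page, so substitute your own.&lt;br /&gt;

```python
OLD_UUID = 'aa65554a-24d9-450a-b10c-63c5c6a4b48a'  # /dev/md0, from blkid
NEW_UUID = '4c63f20b-ad35-477d-bfaa-82571beba841'  # /dev/bcache0, from blkid

FSTAB = '''\
UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1
UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2
/swapfile                                 none            swap    sw 0 0
'''


def swap_mount(fstab_text, old_uuid, new_line):
    '''Comment out the entry for old_uuid and put new_line right after it.'''
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        if fields and fields[0] == 'UUID=' + old_uuid:
            out.append('#' + line)  # keep the old entry for reference
            out.append(new_line)
        else:
            out.append(line)
    return '\n'.join(out) + '\n'


NEW_ENTRY = 'UUID=' + NEW_UUID + ' /bulk ext4 rw 0 0'
print(swap_mount(FSTAB, OLD_UUID, NEW_ENTRY), end='')
```

Review the result by hand before copying it over the real file; a bad fstab entry can stop the box from booting cleanly.&lt;br /&gt;
&lt;br /&gt;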
And update your boot image and give it a reboot to check the new bcache array comes back up ok:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
&lt;br /&gt;
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check that samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask= 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
&lt;br /&gt;
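As a quick sanity check on the share definition, the [bulk] section happens to be simple enough to read with Python's configparser. This is only a sketch under that assumption: a full smb.conf is not guaranteed to parse this way, and testparm remains the authoritative check. The SHARE text is the section from this page with whitespace tidied.&lt;br /&gt;

```python
import configparser

# The [bulk] share as listed on this page, with whitespace normalised.
SHARE = '''\
[bulk]
comment = Bulk RAID Array
path = /bulk
browseable = yes
create mask = 0775
directory mask = 0775
read only = no
guest ok = no
'''

parser = configparser.ConfigParser()
parser.read_string(SHARE)
bulk = parser['bulk']

# getboolean understands yes/no, which is how this section spells booleans.
writable = not bulk.getboolean('read only')
guests = bulk.getboolean('guest ok')
print(bulk['path'], writable, guests)
```

For the section above this reports that /bulk is writable and that guests are not allowed.&lt;br /&gt;
&lt;br /&gt;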
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -p 1234 -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
&lt;br /&gt;
And add users to the samba group (if not already):&lt;br /&gt;
 usermod -G smbusers researcher&lt;br /&gt;
Note that -G without -a replaces the user's supplementary groups, so it will overwrite sudo or other group assignments. Don't do it with your main account. Instead, append the group:&lt;br /&gt;
 usermod -a -G smbusers ed&lt;br /&gt;
&lt;br /&gt;
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use CUDA 10.0:&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the latest official CUDA image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
Note: you can clean up stopped docker containers (along with unused networks and dangling images) with&lt;br /&gt;
 docker system prune&lt;br /&gt;
 &lt;br /&gt;
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an install directory in /bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
From https://www.anaconda.com/distribution/ the latest Python version offered is 3.7, so:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
&lt;br /&gt;
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
&lt;br /&gt;
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install a VNC client on the machine you will connect from. We use the standalone exe from TigerVNC. &lt;br /&gt;
&lt;br /&gt;
Now install the TightVNC server on the box, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As user &lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user (ailia)&lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
To make changes (or after the edit)&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
&lt;br /&gt;
Instructions on how to set up an IP tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in puTTY.exe and checked to see which local port was listening (it was 5901) and not firewalled using the listening ports tab under network on resmon.exe (it said allowed, not restricted under firewall status). VNC seemed to be running fine on Bastard, and I tried connecting to localhost::1 (that is 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The localhost port seems to be open and listening just fine: &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -p 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
&lt;br /&gt;
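When debugging this, it helps to keep the display-to-port mapping straight: VNC display :N listens on TCP port 5900 + N, and the viewers we have used accept either host:display or host::port. A small sketch of that convention, with function names of our own choosing:&lt;br /&gt;

```python
VNC_BASE_PORT = 5900  # display :N listens on TCP port 5900 + N


def display_to_port(display):
    '''Map a VNC display number (the N in :N) to its TCP port.'''
    return VNC_BASE_PORT + display


def viewer_targets(host, display):
    '''Return both spellings of the same target: host:display and host::port.'''
    port = display_to_port(display)
    return (host + ':' + str(display), host + '::' + str(port))


print(display_to_port(2))
print(viewer_targets('192.168.2.202', 2))
```

Display :2 maps to port 5902, which is what netstat showed Xtightvnc listening on. By the same rule, a tunnel whose local end is port 5901 would be reached as localhost:1 or localhost::5901.&lt;br /&gt;
&lt;br /&gt;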
====Later Notes====&lt;br /&gt;
&lt;br /&gt;
=====Change the resolution=====&lt;br /&gt;
&lt;br /&gt;
I came back and changed the resolution to make it work on one of my portrait desktop monitors.&lt;br /&gt;
See https://www.tightvnc.com/vncserver.1.php&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  Change line:&lt;br /&gt;
   ExecStart=/usr/bin/vncserver -depth 24 -geometry 1440x2560 :%i&lt;br /&gt;
  (Note that the size is 2160x3840 divided by 150%.) Leave the color depth alone, as it says elsewhere that changing it is bad.&lt;br /&gt;
 systemctl daemon-reload&lt;br /&gt;
 systemctl enable vncserver@2.service&lt;br /&gt;
&lt;br /&gt;
As Ed:&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
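&lt;br /&gt;
The geometry value used in the service file can be checked with shell integer math. This is only a sketch: scaled_geometry is a hypothetical helper, not part of TightVNC, and it assumes the scaling factor is given as a whole percentage.&lt;br /&gt;
&lt;br /&gt;
```shell
# Divide a native resolution by a display scaling percentage.
# Arguments: width, height, scale in percent (for example 150).
scaled_geometry() {
  width=$1; height=$2; scale_percent=$3
  echo "$((width * 100 / scale_percent))x$((height * 100 / scale_percent))"
}

# A 2160x3840 portrait monitor at 150% scaling.
scaled_geometry 2160 3840 150
```
&lt;br /&gt;
For the portrait monitor above this prints 1440x2560, which is the value passed to -geometry.&lt;br /&gt;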
&lt;br /&gt;
=====Cut And Paste=====&lt;br /&gt;
&lt;br /&gt;
Also, try to fix the cut-and-paste issue. See, for example, https://unix.stackexchange.com/questions/35030/how-can-i-copy-paste-data-to-and-from-the-windows-clipboard-to-an-opensuse-clipb&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 apt-get install autocutsel&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  autocutsel -fork  &lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
&lt;br /&gt;
Though this might have been working fine anyway. Just change the terminal and all will be well.&lt;br /&gt;
&lt;br /&gt;
=====Use XFCE terminal=====&lt;br /&gt;
&lt;br /&gt;
Change Settings: Preferred Applications -&amp;gt; Utilities -&amp;gt; Terminal to XFCE&lt;br /&gt;
&lt;br /&gt;
Note that this seems to fix everything, but the instructions for customizing the menu are here: https://wiki.xfce.org/howto/customize-menu&lt;br /&gt;
 cat /etc/xdg/menus/xfce-applications.menu&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.02 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  Which seems ok, despite having a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=2020.2.3&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;br /&gt;
&lt;br /&gt;
Note that when I came back to the box the launcher didn't work...&lt;br /&gt;
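&lt;br /&gt;
One possible cause (an unconfirmed guess) is that the snap was refreshed, so the hard-coded revision directory 214 no longer exists. snapd keeps a current symlink that points at the active revision, so a launcher could be generated against that instead. In the sketch below the function name write_pycharm_launcher is hypothetical, and the Version key is left out because in the desktop entry specification it refers to the spec version rather than the application version.&lt;br /&gt;
&lt;br /&gt;
```shell
# Write a PyCharm desktop entry to the given file.
# Arguments: target file, optional snap directory (defaults to the
# "current" symlink rather than a fixed revision number).
write_pycharm_launcher() {
  target=$1
  snap_dir=${2:-/snap/pycharm-community/current}
  printf '%s\n' \
    '[Desktop Entry]' \
    'Type=Application' \
    'Name=PyCharm' \
    "Icon=$snap_dir/bin/pycharm.png" \
    "Exec=\"$snap_dir/bin/pycharm.sh\" %f" \
    'Comment=The Drive to Develop' \
    'Categories=Development;IDE;' \
    'Terminal=false' \
    'StartupWMClass=jetbrains-pycharm' > "$target"
}

# Demonstrate on a scratch file instead of /usr/share/applications.
launcher=$(mktemp)
write_pycharm_launcher "$launcher"
cat "$launcher"
```
&lt;br /&gt;
To use it for real, the target would be /usr/share/applications/pycharm.desktop, written as root.&lt;br /&gt;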
&lt;br /&gt;
==== MATLAB ====&lt;br /&gt;
&lt;br /&gt;
I installed MATLAB R2024a by downloading the zip, running&lt;br /&gt;
 sudo ./install&lt;br /&gt;
&lt;br /&gt;
and using the defaults of /usr/local/MATLAB/R2024 etc. The license number is 41201644.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48663</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48663"/>
		<updated>2024-08-05T19:53:32Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Later Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. NVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbo &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a XEON e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD hard drive (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants on them, but their prices are high too and the details on their specs are limited (the MoBo and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts cost is perhaps $4-5k now for the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon e5-2620v3, 6 core @ 2.4GHz, 6x256KB level 2 cache, 15MB level 3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will build the RAID 5 array in software, rather than using the X99 chipset through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (right most) network port is connected to a DHCP broadcasting router.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
===Hardware and Drivers===&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
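&lt;br /&gt;
Picking the recommended package out of that listing can be scripted. This is a rough sketch that assumes the output format shown above (a driver line with the package name in the third field and the word recommended at the end); recommended_driver is a made-up helper name, and the sample lines are copied from the listing rather than read from a live system.&lt;br /&gt;
&lt;br /&gt;
```shell
# Read `ubuntu-drivers devices` style output on stdin and print the
# package name from any driver line flagged as recommended.
recommended_driver() {
  awk '$1 == "driver" { if ($0 ~ /recommended/) print $3 }'
}

# Sample lines copied from the listing above.
printf '%s\n' \
  'driver   : nvidia-driver-418 - third-party free' \
  'driver   : nvidia-driver-430 - third-party free recommended' \
  'driver   : nvidia-driver-390 - distro non-free' | recommended_driver
```
&lt;br /&gt;
On the box itself the equivalent would be ubuntu-drivers devices piped into recommended_driver. With more than one GPU listed it may print more than one line.&lt;br /&gt;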
&lt;br /&gt;
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
&lt;br /&gt;
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required &lt;br /&gt;
 for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
&lt;br /&gt;
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
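&lt;br /&gt;
After logging in again, it is worth confirming that the CUDA directory actually made it onto the path. A small sketch; path_has_dir is a hypothetical helper that does an exact match on one colon-separated entry, so a similarly named directory does not count.&lt;br /&gt;
&lt;br /&gt;
```shell
# Succeed if directory $1 appears as a whole entry in the
# colon-separated list $2 (for example "$PATH").
path_has_dir() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if path_has_dir /usr/local/cuda-10.0/bin "$PATH"; then
  echo "CUDA bin directory is on PATH"
else
  echo "CUDA bin directory is missing from PATH"
fi
```
&lt;br /&gt;
The same function works for LD_LIBRARY_PATH by passing /usr/local/cuda-10.0/lib64 and that variable instead.&lt;br /&gt;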
&lt;br /&gt;
With CUDA 10.0, you don't need to edit rc.local to add a line that starts the persistence daemon:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
 It was already installed and the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd 850&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
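&lt;br /&gt;
The 7.3T shown for md0 is consistent with RAID5 over three 4TB members: one member's worth of space goes to parity, and lsblk reports binary units. A rough check, assuming the drives are sold in decimal terabytes (raid5_usable_tib is a made-up helper name):&lt;br /&gt;
&lt;br /&gt;
```shell
# Usable RAID5 capacity in TiB for N members of a given decimal-TB size.
# RAID5 keeps (N - 1) members' worth of data; the rest is parity.
raid5_usable_tib() {
  LC_ALL=C awk -v n="$1" -v tb="$2" \
    'BEGIN { printf "%.1f\n", (n - 1) * tb * 1e12 / 2^40 }'
}

# Three 4TB WD Red drives.
raid5_usable_tib 3 4
```
&lt;br /&gt;
This prints 7.3, matching the lsblk size for md0 (about 8TB in decimal units).&lt;br /&gt;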
&lt;br /&gt;
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID refers to /dev/md0, whereas the UUID in the contents of mdadm.conf is the UUID of the 3 RAID5 drives together:&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array:&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both devices (md0 as backing, m.2 as cache). Note that when you do it in one command, the cache is attached to the backing device automatically.&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you screw up, cd to /sys/fs/bcache/whatever and then ls -l cache0. If there is an entry in there echo 1 &amp;gt; stop. This unregisters the cache and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
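&lt;br /&gt;
The new fstab line can also be generated rather than typed, which avoids copying the UUID by hand. A minimal sketch: fstab_entry is a hypothetical helper, and the filesystem type and options are simply the ones used in the entry above.&lt;br /&gt;
&lt;br /&gt;
```shell
# Build an fstab line for an ext4 filesystem identified by UUID.
# Arguments: UUID, mount point.
fstab_entry() {
  echo "UUID=$1 $2 ext4 rw 0 0"
}

# Using the UUID reported by blkid for /dev/bcache0 above.
fstab_entry 4c63f20b-ad35-477d-bfaa-82571beba841 /bulk
```
&lt;br /&gt;
On the box, the first argument could come from blkid -s UUID -o value /dev/bcache0 instead of being pasted in.&lt;br /&gt;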
&lt;br /&gt;
And update your boot image and give it a reboot to check the new bcache array comes back up ok:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
&lt;br /&gt;
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask= 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
&lt;br /&gt;
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -p 1234 -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
&lt;br /&gt;
And add users to the samba group (if not already):&lt;br /&gt;
 usermod -G smbusers researcher #Note that this sets the supplementary groups and will overwrite sudo or other group assignments, so don't do it with your main account. Instead, append with -a:&lt;br /&gt;
  usermod -a -G smbusers ed&lt;br /&gt;
&lt;br /&gt;
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use CUDA 10.0:&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the latest official CUDA image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
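Note: docker system prune does not kill running containers; it removes stopped containers, unused networks, and dangling images. So stop the container first and then clean up:&lt;br /&gt;
 docker stop digits&lt;br /&gt;
 docker system prune&lt;br /&gt;
&lt;br /&gt;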
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an install directory in /bulk (/bulk/install) and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
From https://www.anaconda.com/distribution/ the latest version (2019.03) is for Python 3.7, so download it and compute its checksum:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
&lt;br /&gt;
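Running sha256sum only prints the hash; it still has to be compared with the value published for the installer. A minimal sketch of that check follows, assuming the expected hash is copied by hand from the Anaconda site. The function name verify_sha256 and the EXPECTED_HASH placeholder are hypothetical and not part of the original build.&lt;br /&gt;
&lt;br /&gt;
```shell
# verify_sha256 FILE EXPECTED_HASH  (hypothetical helper, not a standard tool)
# Succeeds only if the SHA-256 of FILE equals EXPECTED_HASH.
verify_sha256() {
    # sha256sum prints "HASH  FILENAME"; keep only the first field
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}
```
&lt;br /&gt;
For example, verify_sha256 Anaconda3-2019.03-Linux-x86_64.sh EXPECTED_HASH || echo "hash mismatch" would flag a corrupted download before the installer is run.&lt;br /&gt;
&lt;br /&gt;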
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
&lt;br /&gt;
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install a VNC client on the remote (client) machine. We use the standalone exe from TigerVNC. &lt;br /&gt;
&lt;br /&gt;
Now install the TightVNC server on the box, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As the user:&lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user (ailia)&lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
To apply changes (after editing the unit file):&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
&lt;br /&gt;
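VNC displays map onto TCP ports by a fixed offset: display :N listens on port 5900 + N, which is why display :2 shows up as port 5902 in the connection notes below. A small sketch of the conversion, assuming the standard 5900 base port; the function name vnc_port is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
# vnc_port DISPLAY  (hypothetical helper)
# Prints the TCP port for a VNC display number, accepting either 2 or :2.
vnc_port() {
    display=${1#:}              # strip a leading colon, if present
    echo $((5900 + display))    # standard VNC base port plus display number
}

vnc_port :2
```
&lt;br /&gt;
This is the port to forward when setting up a tunnel, or to give explicitly to a client that takes host::port.&lt;br /&gt;
&lt;br /&gt;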
Instructions on how to set up an SSH tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in PuTTY and checked which local port was listening (it was 5901) and that it was not firewalled, using the listening ports tab under network in resmon.exe (the firewall status said allowed, not restricted). VNC seemed to be running fine on Bastard, and I tried connecting to localhost:1 (that is, 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The local port on the Windows side seems to be open and listening just fine (checked from PowerShell): &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -Port 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
&lt;br /&gt;
====Later Notes====&lt;br /&gt;
&lt;br /&gt;
=====Change the resolution=====&lt;br /&gt;
&lt;br /&gt;
I came back and changed the resolution to make it work on one of my portrait desktop monitors.&lt;br /&gt;
See https://www.tightvnc.com/vncserver.1.php&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  Change line:&lt;br /&gt;
   ExecStart=/usr/bin/vncserver -depth 24 -geometry 1440x2560 :%i&lt;br /&gt;
  (Note that the size is 2160x3840 divided by 1.5, i.e., 150% scaling.) Leave the color depth alone: as noted above, changing it breaks the server.&lt;br /&gt;
 systemctl daemon-reload&lt;br /&gt;
 systemctl enable vncserver@2.service&lt;br /&gt;
&lt;br /&gt;
As Ed:&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
&lt;br /&gt;
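The 1440x2560 geometry used above is just the monitor's native resolution divided by the display scaling factor. A minimal sketch of that arithmetic, assuming the scaling is given as a whole-number percentage; scaled_geometry is a hypothetical helper name.&lt;br /&gt;
&lt;br /&gt;
```shell
# scaled_geometry WIDTH HEIGHT SCALE_PERCENT  (hypothetical helper)
# Prints WIDTHxHEIGHT divided by the scaling factor, using integer arithmetic.
scaled_geometry() {
    echo "$(( $1 * 100 / $3 ))x$(( $2 * 100 / $3 ))"
}

scaled_geometry 2160 3840 150
```
&lt;br /&gt;
The result can be passed straight to the -geometry option in the ExecStart line.&lt;br /&gt;
&lt;br /&gt;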
=====Cut And Paste=====&lt;br /&gt;
&lt;br /&gt;
Also, try to fix the cut-and-paste issue. See, for example, https://unix.stackexchange.com/questions/35030/how-can-i-copy-paste-data-to-and-from-the-windows-clipboard-to-an-opensuse-clipb&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 apt-get install autocutsel&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  autocutsel -fork  &lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
&lt;br /&gt;
Though this might have been working fine anyway. Just change the terminal and all will be well.&lt;br /&gt;
&lt;br /&gt;
=====Use XFCE terminal=====&lt;br /&gt;
&lt;br /&gt;
Change Settings: Preferred Applications -&amp;gt; Utilities -&amp;gt; Terminal to XFCE&lt;br /&gt;
&lt;br /&gt;
Note that this seems to fix everything, but the instructions for customizing the menu are here: https://wiki.xfce.org/howto/customize-menu&lt;br /&gt;
 cat /etc/xdg/menus/xfce-applications.menu&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.02 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  Which seems ok, despite having a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=2020.2.3&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;br /&gt;
&lt;br /&gt;
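Note that when I came back to the box the launcher didn't work, possibly because the snap had been updated and the revision directory (214) no longer existed. Snap keeps a stable symlink at /snap/pycharm-community/current, which may be a safer path to use.&lt;br /&gt;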
&lt;br /&gt;
==== MATLAB ====&lt;br /&gt;
&lt;br /&gt;
I installed MATLAB R2024a by downloading the zip, running&lt;br /&gt;
 sudo ./install&lt;br /&gt;
&lt;br /&gt;
and using the defaults of /usr/local/MATLAB/R2024 etc. The license number is 41201644.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48661</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48661"/>
		<updated>2024-08-02T22:06:06Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Later Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. NVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbox &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a Xeon e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs (one Titan RTX and one Titan Xp) with room for two more, a 500GB SSD (mounting /), and an 8TB RAID5 array bcached with a 500GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants on them, but their prices are high too and the details on their specs are limited (the MoBo and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts cost is perhaps $4-5k now for the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon e5-2620v3, 6 cores @ 2.4GHz, 6x256KB level 2 cache, 15MB level 3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than Windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will do the RAID 5 array in software, rather than using the X99 chipset through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (rightmost) network port is connected to a DHCP broadcasting router.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
===Hardware and Drivers===&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
&lt;br /&gt;
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required &lt;br /&gt;
 for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
&lt;br /&gt;
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
&lt;br /&gt;
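The ${PATH:+:${PATH}} expansion in the single-user exports appends a colon and the old value only when the variable is already set and non-empty. That avoids leaving a trailing colon, which the shell and the dynamic linker would treat as the current directory. A sketch of the same idiom as a reusable function follows; prepend_dir is a hypothetical name, and it takes the old value as an argument so that it can be tried without touching the real PATH.&lt;br /&gt;
&lt;br /&gt;
```shell
# prepend_dir DIR OLD_VALUE  (hypothetical helper)
# Prints DIR followed by :OLD_VALUE, omitting the colon when OLD_VALUE is empty.
prepend_dir() {
    old=${2-}                   # treat a missing second argument as empty
    echo "$1${old:+:${old}}"    # same idiom as ${PATH:+:${PATH}}
}

prepend_dir /usr/local/cuda-10.0/lib64 ""
prepend_dir /usr/local/cuda-10.0/bin /usr/bin:/bin
```
&lt;br /&gt;
Note that this expansion only works in the shell export form. As far as I know, /etc/environment is read by pam_env rather than by a shell, so the values there have to be written out in full, as shown above.&lt;br /&gt;
&lt;br /&gt;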
With CUDA 10.0, you don't need to edit rc.local to start the persistence daemon with:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
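&lt;br /&gt;
Both samples should end their output with a line reading Result = PASS. A minimal sketch of checking for that in a script follows; it assumes the pass marker is exactly that string, the function name sample_passed is made up, and a stand-in string is used in place of the real program output so that it runs without a GPU:&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: succeed if the captured output contains the pass marker.
sample_passed() {
    case "$1" in
        *"Result = PASS"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Stand-in for "$(./deviceQuery)" so the sketch runs anywhere.
captured="Result = PASS"

if sample_passed "$captured"; then
    echo "sample passed"
else
    echo "sample failed"
fi
```
&lt;br /&gt;
On the box itself, replace the stand-in with captured=$(./deviceQuery) or captured=$(./bandwidthTest).&lt;br /&gt;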
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
 It was already installed and the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd Samsung 850 SSD (mounted as /)&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
&lt;br /&gt;
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID in fstab is the filesystem UUID of /dev/md0, whereas the UUID in mdadm.conf identifies the RAID5 array itself (the three member drives together):&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array:&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both devices (md0 as backing, m.2 as cache). Note that when you do it in one command, the cache is attached to the backing device automatically.&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you screw up, cd to /sys/fs/bcache/ and into the directory named after the cache set's UUID, then ls -l cache0. If there is an entry in there, echo 1 &amp;gt; stop. This unregisters the cache and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
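&lt;br /&gt;
Before rebooting, it is worth confirming which entry fstab now has for /bulk. The following is a rough sketch that pulls the active spec for a mount point out of an fstab-format file; the function name fstab_spec_for is made up, and the demo runs against a temporary copy of the two entries above rather than the real /etc/fstab:&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print the first field (the spec, e.g. UUID=...) of every
# uncommented entry in the fstab-format file $1 whose mount point is $2.
fstab_spec_for() {
    awk -v mp="$2" '$1 !~ /^#/ { if ($2 == mp) print $1 }' "$1"
}

# Demo on a temporary file holding the old (commented out) and new entries.
demo=$(mktemp)
printf '%s\n' \
    '#UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk ext4 defaults 0 2' \
    'UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0' > "$demo"
fstab_spec_for "$demo" /bulk
rm -f "$demo"
```
&lt;br /&gt;
On the box, run fstab_spec_for /etc/fstab /bulk and compare the result with the UUID reported by blkid /dev/bcache0. More than one line of output means an old entry is still active.&lt;br /&gt;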
&lt;br /&gt;
And update your boot image and give it a reboot to check the new bcache array comes back up ok:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
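&lt;br /&gt;
After the reboot, the kernel exposes the bcache device's status through sysfs. A minimal sketch of reading it follows, assuming the device came back as bcache0 and that its sysfs directory is /sys/block/bcache0/bcache (as described in the kernel's bcache documentation); the function name bcache_state is made up:&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print the contents of the "state" file in the given
# bcache sysfs directory, or "unknown" if it cannot be read.
bcache_state() {
    if [ -r "$1/state" ]; then
        cat "$1/state"
    else
        echo "unknown"
    fi
}

# Assumes the device is bcache0; prints "unknown" on a machine without bcache.
bcache_state /sys/block/bcache0/bcache
```
&lt;br /&gt;
According to the kernel documentation the state is one of no cache, clean, dirty or inconsistent, so anything other than clean or dirty deserves a closer look.&lt;br /&gt;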
&lt;br /&gt;
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check that samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask= 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
&lt;br /&gt;
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
&lt;br /&gt;
And add users to the samba group (if not already):&lt;br /&gt;
 usermod -G smbusers researcher #Note that -G without -a replaces the user's supplementary groups, so it will overwrite sudo or other group assignments. Don't do it with your main account. Instead just:&lt;br /&gt;
  adduser ed smbusers&lt;br /&gt;
&lt;br /&gt;
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use CUDA 10.0:&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the latest official CUDA image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
Note: you can clean up stopped containers (along with unused networks and dangling images) with&lt;br /&gt;
 docker system prune&lt;br /&gt;
 &lt;br /&gt;
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an install directory in /bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
From https://www.anaconda.com/distribution/ the latest Python version is 3.7, so:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
&lt;br /&gt;
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
&lt;br /&gt;
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install a VNC client on the machine you will connect from. We use the standalone exe from TigerVNC. &lt;br /&gt;
&lt;br /&gt;
Now install TightVNC on the DevBox, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As your regular user:&lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user (ailia)&lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
After creating or editing the unit file, reload systemd and restart the service:&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
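&lt;br /&gt;
The display number also determines the TCP port, since VNC servers conventionally listen on 5900 plus the display number. A tiny sketch of that arithmetic (the function name vnc_port is made up):&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print the conventional TCP port for a VNC display number.
vnc_port() {
    echo $(( 5900 + $1 ))
}

# Display :2, as used by vncserver@2.service.
vnc_port 2
```
&lt;br /&gt;
This is the port to give to a client that wants a port number rather than a display number, and the port to forward if you tunnel the connection.&lt;br /&gt;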
&lt;br /&gt;
Instructions on how to set up an SSH tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in PuTTY and checked which local port was listening (it was 5901) and that it was not firewalled, using the listening ports tab under network in resmon.exe (it said allowed, not restricted under firewall status). VNC seemed to be running fine on Bastard, and I tried connecting to localhost:1 (that is, 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The localhost port seems to be open and listening just fine: &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -p 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
&lt;br /&gt;
====Later Notes====&lt;br /&gt;
&lt;br /&gt;
I came back and changed the resolution to make it work on one of my portrait desktop monitors.&lt;br /&gt;
See https://www.tightvnc.com/vncserver.1.php&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  Change line:&lt;br /&gt;
   ExecStart=/usr/bin/vncserver -depth 24 -geometry 1440x2560 :%i&lt;br /&gt;
  (Note that the size is 2160x3840 divided by 150%.) Leave the color depth alone, as it says elsewhere that changing it breaks things.&lt;br /&gt;
 systemctl daemon-reload&lt;br /&gt;
 systemctl enable vncserver@2.service&lt;br /&gt;
&lt;br /&gt;
As Ed:&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
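&lt;br /&gt;
The geometry above comes from dividing the monitor's native resolution by the desktop scaling factor. A small sketch of that calculation, using integer arithmetic only (the function name scaled_geometry is made up, and results are truncated rather than rounded):&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print WIDTHxHEIGHT after dividing both dimensions
# by a scaling percentage. Arguments: width height percent.
scaled_geometry() {
    width=$1
    height=$2
    percent=$3
    echo "$(( width * 100 / percent ))x$(( height * 100 / percent ))"
}

# A portrait 2160x3840 monitor at 150% scaling.
scaled_geometry 2160 3840 150
```
&lt;br /&gt;
The result is the value to pass to -geometry in the ExecStart line.&lt;br /&gt;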
&lt;br /&gt;
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
&lt;br /&gt;
Also, try to fix the cut-and-paste issue. See, for example, https://unix.stackexchange.com/questions/35030/how-can-i-copy-paste-data-to-and-from-the-windows-clipboard-to-an-opensuse-clipb&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 apt-get install autocutsel&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  autocutsel -fork  &lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.02 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  Which seems ok, despite having a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=2020.2.3&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;br /&gt;
&lt;br /&gt;
Note that when I came back to the box the launcher didn't work...&lt;br /&gt;
&lt;br /&gt;
==== MATLAB ====&lt;br /&gt;
&lt;br /&gt;
I installed MATLAB R2024a by downloading the zip, running&lt;br /&gt;
 sudo ./install&lt;br /&gt;
&lt;br /&gt;
and using the defaults of /usr/local/MATLAB/R2024 etc. The license number is 41201644.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48660</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48660"/>
		<updated>2024-08-02T22:00:33Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Later Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. NVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbox &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a XEON e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD hard drive (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants of them, but their prices are high too and the details on their specs are limited (the MoBo and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts cost is perhaps $4-5k now for the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon e5-2620v3, 6 core @ 2.4GHz, 6x256KB level 2 cache, 15MB level 3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will set up the RAID 5 array in software, rather than using the X99 chipset's RAID through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (rightmost) network port is connected to a router running DHCP.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
===Hardware and Drivers===&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
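&lt;br /&gt;
If you want to pick the recommended package out of that listing in a script, something like the following should work. It is only a sketch that assumes the output layout shown above; recommended_driver is a made-up helper name.&lt;br /&gt;
&lt;br /&gt;
```shell
# Print the package that ubuntu-drivers marks as recommended.
# Reads "ubuntu-drivers devices" output on stdin. recommended_driver is a
# hypothetical helper name, and the field positions assume the layout above.
recommended_driver() {
    awk '$1 == "driver" { if ($NF == "recommended") print $3 }'
}

# Usage: ubuntu-drivers devices | recommended_driver
```
&lt;br /&gt;
With more than one GPU listed, this prints one line per device.&lt;br /&gt;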
&lt;br /&gt;
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
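&lt;br /&gt;
A quick way to confirm that the blacklist file contains the expected line is sketched below. The helper name nouveau_blacklisted is ours, and the check only looks at the file; it does not prove that the module is unloaded.&lt;br /&gt;
&lt;br /&gt;
```shell
# Return success if the given modprobe config file blacklists nouveau.
# nouveau_blacklisted is a hypothetical helper name; the default path is
# the file edited above. The -s flag keeps grep quiet if the file is missing.
nouveau_blacklisted() {
    conf=${1:-/etc/modprobe.d/blacklist-nouveau.conf}
    grep -qs '^ *blacklist  *nouveau' "$conf"
}

if nouveau_blacklisted; then
    echo "nouveau is blacklisted"
else
    echo "nouveau is not blacklisted"
fi
```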
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
&lt;br /&gt;
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
&lt;br /&gt;
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
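&lt;br /&gt;
For the per-user route, re-running the export lines adds the same directory again each time. A minimal sketch of a prepend that skips directories already present (path_prepend is a made-up helper name). This is for shell startup files only; /etc/environment is not a shell script and does not expand variables.&lt;br /&gt;
&lt;br /&gt;
```shell
# Prepend a directory to a colon-separated path list unless it is already
# present. path_prepend is a hypothetical helper name.
path_prepend() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) echo "$list" ;;
        *) echo "$dir${list:+:$list}" ;;
    esac
}

PATH=$(path_prepend /usr/local/cuda-10.0/bin "$PATH")
LD_LIBRARY_PATH=$(path_prepend /usr/local/cuda-10.0/lib64 "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```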
&lt;br /&gt;
With CUDA 10.0, you don't need to edit rc.local to start the persistence daemon with:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
  It was already installed and at the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd 850&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
&lt;br /&gt;
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID in fstab is the UUID of the ext4 filesystem on /dev/md0, whereas the UUID in mdadm.conf identifies the RAID5 array itself (the 3 member drives together):&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
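&lt;br /&gt;
The same check can be scripted. This is a sketch that assumes the usual mdstat layout, where each array line starts with the array name followed by its state; md_active is a made-up helper name.&lt;br /&gt;
&lt;br /&gt;
```shell
# Return success if the named md array is listed as active in an mdstat file.
# md_active is a hypothetical helper; the second argument defaults to
# /proc/mdstat and exists only so the check can be pointed at a saved copy.
md_active() {
    array=$1
    mdstat=${2:-/proc/mdstat}
    grep -qs "^$array : active" "$mdstat"
}

if md_active md0; then
    echo "md0 is active"
else
    echo "md0 is not active"
fi
```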
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array:&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both devices (md0 as backing, m.2 as cache). Note that when you format both in one command, the cache is attached to the backing device automatically.&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you make a mistake, cd to /sys/fs/bcache/&amp;lt;cache-set-UUID&amp;gt; and then ls -l cache0. If there is an entry in there, echo 1 &amp;gt; stop. This unregisters the cache set and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
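&lt;br /&gt;
The kernel documentation linked above describes a state file in sysfs for each bcache device (no cache, clean, dirty or inconsistent). A small sketch for reading it; bcache_state is a made-up helper name.&lt;br /&gt;
&lt;br /&gt;
```shell
# Print the state of a bcache device as reported by sysfs.
# bcache_state is a hypothetical helper; the sysfs root is a parameter only
# so the function can be exercised against a test directory.
bcache_state() {
    dev=${1:-bcache0}
    sysroot=${2:-/sys}
    cat "$sysroot/block/$dev/bcache/state"
}

# Usage: bcache_state bcache0
```
&lt;br /&gt;
A state of no cache would suggest that the cache device did not get attached.&lt;br /&gt;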
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
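&lt;br /&gt;
To avoid copying the UUID by hand, the new line can be generated. A minimal sketch, with fstab_entry as a made-up helper name and the defaults taken from the entry above:&lt;br /&gt;
&lt;br /&gt;
```shell
# Build an fstab line for a filesystem identified by UUID.
# fstab_entry is a hypothetical helper; the filesystem type, options and the
# dump/pass fields default to the values used in the entry above.
fstab_entry() {
    uuid=$1
    mountpoint=$2
    fstype=${3:-ext4}
    options=${4:-rw}
    echo "UUID=$uuid $mountpoint $fstype $options 0 0"
}

fstab_entry 4c63f20b-ad35-477d-bfaa-82571beba841 /bulk
# On the box itself the UUID could come straight from blkid:
#   fstab_entry "$(blkid -s UUID -o value /dev/bcache0)" /bulk
```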
&lt;br /&gt;
Then update the boot image and reboot to check that the new bcache array comes back up OK:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
&lt;br /&gt;
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check that samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask= 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
&lt;br /&gt;
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group:&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -p 1234 -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
&lt;br /&gt;
And add users to the samba group (if not already). Note that usermod -G without -a replaces the user's supplementary groups, so it will overwrite sudo or other group assignments; don't do it with your main account:&lt;br /&gt;
 usermod -G smbusers researcher&lt;br /&gt;
Instead, append the group:&lt;br /&gt;
 usermod -aG smbusers ed&lt;br /&gt;
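&lt;br /&gt;
Group membership can be confirmed with id. A minimal sketch (in_group is a made-up helper name):&lt;br /&gt;
&lt;br /&gt;
```shell
# Return success if the user belongs to the group (primary or supplementary).
# in_group is a hypothetical helper name.
in_group() {
    user=$1
    group=$2
    id -nG "$user" | tr ' ' '\n' | grep -qxF "$group"
}

if in_group researcher smbusers; then
    echo "researcher is in smbusers"
else
    echo "researcher is not in smbusers"
fi
```
&lt;br /&gt;
A user who is already logged in will not pick up a new group until the next login.&lt;br /&gt;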
&lt;br /&gt;
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use cuda 10.0:&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the latest official CUDA image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
Note: you can clean up stopped containers (along with unused networks and dangling images) with&lt;br /&gt;
 docker system prune&lt;br /&gt;
 &lt;br /&gt;
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an install directory in /bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
From https://www.anaconda.com/distribution/ the latest Python version offered is 3.7, so:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
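&lt;br /&gt;
The digest printed by sha256sum still has to be compared with the one published for the installer. A sketch of doing that comparison in a script follows; verify_sha256 is a made-up helper name, and the expected digest has to be copied from the publisher's page (it is not reproduced here).&lt;br /&gt;
&lt;br /&gt;
```shell
# Compare a file's SHA-256 digest with an expected value.
# verify_sha256 is a hypothetical helper; the expected digest must be copied
# from the publisher's download page.
verify_sha256() {
    file=$1
    expected=$2
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Usage: verify_sha256 Anaconda3-2019.03-Linux-x86_64.sh EXPECTED_DIGEST
```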
&lt;br /&gt;
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
&lt;br /&gt;
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install a VNC client on the machine you will connect from. We use the standalone exe from TigerVNC. &lt;br /&gt;
&lt;br /&gt;
Now install TightVNC, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As the user:&lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user (ailia)&lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
After the edit (or after any later change), reload and restart:&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
&lt;br /&gt;
Instructions on how to set up an IP tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
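&lt;br /&gt;
When filling in the tunnel, it helps to remember that VNC display :N listens on TCP port 5900 + N, so display :2 is port 5902. A tiny sketch of that mapping (vnc_port is a made-up helper name):&lt;br /&gt;
&lt;br /&gt;
```shell
# TCP port used by a VNC display: 5900 plus the display number.
# vnc_port is a hypothetical helper name.
vnc_port() {
    echo $(( 5900 + $1 ))
}

vnc_port 2   # prints 5902
# With OpenSSH instead of PuTTY, the equivalent tunnel would look like this
# (user and host are placeholders):
#   ssh -L 5901:localhost:5902 user@host
```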
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in PuTTY. Using the listening ports tab under network in resmon.exe, I checked which local port was listening (it was 5901) and that it was not firewalled (the firewall status said allowed, not restricted). VNC seemed to be running fine on Bastard, and I tried connecting to localhost::1 (that is 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The localhost port seems to be open and listening just fine: &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -p 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
&lt;br /&gt;
====Later Notes====&lt;br /&gt;
&lt;br /&gt;
I came back and changed the resolution to make it work on one of my portrait desktop monitors.&lt;br /&gt;
See https://www.tightvnc.com/vncserver.1.php&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  Change line:&lt;br /&gt;
   ExecStart=/usr/bin/vncserver -depth 24 -geometry 1440x2560 :%i&lt;br /&gt;
  (Note that the size is 2160x3840 divided by 150%.) Leave the color depth alone, as changing it breaks things (see above).&lt;br /&gt;
 systemctl daemon-reload&lt;br /&gt;
 systemctl enable vncserver@2.service&lt;br /&gt;
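&lt;br /&gt;
The geometry is just the monitor's native resolution divided by its scaling factor. A sketch of the arithmetic for other monitors (scaled_geometry is a made-up helper name):&lt;br /&gt;
&lt;br /&gt;
```shell
# Scale a native resolution down by a display scaling percentage.
# scaled_geometry is a hypothetical helper name.
scaled_geometry() {
    width=$1
    height=$2
    percent=$3
    echo "$(( width * 100 / percent ))x$(( height * 100 / percent ))"
}

scaled_geometry 2160 3840 150   # prints 1440x2560
```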
&lt;br /&gt;
As Ed:&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
&lt;br /&gt;
Also, try to fix the cut-and-paste issue, as root:&lt;br /&gt;
 apt-get install autocutsel&lt;br /&gt;
&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  autocutsel -fork  &lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.02 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  Which seems ok, despite having a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=2020.2.3&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;br /&gt;
&lt;br /&gt;
Note that when I came back to the box the launcher didn't work...&lt;br /&gt;
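&lt;br /&gt;
One possible cause (an assumption, not something confirmed on this box) is that snap refreshed PyCharm to a new revision, so the hard-coded /snap/pycharm-community/214 path no longer existed. Snap keeps a current symlink next to the revision directories, so a launcher that points at it should survive refreshes. A sketch that prints such an entry (pycharm_desktop_entry is a made-up helper name):&lt;br /&gt;
&lt;br /&gt;
```shell
# Print a desktop entry that points at the snap's "current" symlink instead
# of a fixed revision directory. pycharm_desktop_entry is a hypothetical
# helper; the keys follow the entry above, with Version left out.
pycharm_desktop_entry() {
    base=/snap/pycharm-community/current/bin
    printf '%s\n' \
        "[Desktop Entry]" \
        "Type=Application" \
        "Name=PyCharm" \
        "Icon=$base/pycharm.png" \
        "Exec=\"$base/pycharm.sh\" %f" \
        "Comment=The Drive to Develop" \
        "Categories=Development;IDE;" \
        "Terminal=false" \
        "StartupWMClass=jetbrains-pycharm"
}

# Usage: pycharm_desktop_entry | sudo tee /usr/share/applications/pycharm.desktop
```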
&lt;br /&gt;
==== MATLAB ====&lt;br /&gt;
&lt;br /&gt;
I installed MATLAB R2024a by downloading the zip, running&lt;br /&gt;
 sudo ./install&lt;br /&gt;
&lt;br /&gt;
and using the defaults of /usr/local/MATLAB/R2024 etc. The license number is 41201644.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48659</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48659"/>
		<updated>2024-08-02T21:58:18Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Later Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. NVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbox &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a XEON e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD hard drive (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants of it, but their prices are high too and the details on their specs are limited (the MoBo and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts cost is perhaps $4-5k now for the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon E5-2620 v3, 6 core @ 2.4GHz, 6x256KB level 2 cache, 15MB level 3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than Windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will build the RAID 5 array in software, rather than using the X99 chipset RAID through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (rightmost) network port is connected to a DHCP broadcasting router.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
===Hardware and Drivers===&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
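&lt;br /&gt;
As a rough sketch (not part of the original build), the same check can be scripted by pairing each device line of the lspci -vk output with its Kernel driver in use line. The function name and the sample text, including the PCI addresses, are illustrative assumptions rather than output captured from this box.&lt;br /&gt;

```python
def kernel_drivers(lspci_output):
    '''Map each PCI device line to its kernel driver (None if no driver is bound).'''
    drivers = {}
    device = None
    for line in lspci_output.splitlines():
        if line and not line[0].isspace():
            # Unindented lines start a new device record.
            device = line.strip()
            drivers[device] = None
        elif device and 'Kernel driver in use:' in line:
            drivers[device] = line.split(':', 1)[1].strip()
    return drivers


# Illustrative sample only; the addresses and wording are assumptions.
SAMPLE = '''05:00.0 VGA compatible controller: NVIDIA Corporation GP102 [TITAN Xp]
\tKernel driver in use: nouveau
\tKernel modules: nvidiafb, nouveau
06:00.0 VGA compatible controller: NVIDIA Corporation TU102 [TITAN RTX]
\tKernel modules: nvidiafb, nouveau
'''

for dev, drv in kernel_drivers(SAMPLE).items():
    print(dev, '->', drv)
```

On the box itself, the input would be the real output of lspci -vk, for example captured with subprocess.run(['lspci', '-vk'], capture_output=True, text=True).stdout.&lt;br /&gt;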
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
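&lt;br /&gt;
If you want to pick the recommended package out of that listing in a script, a minimal sketch follows. It assumes the driver lines keep the layout shown above, with the word recommended as the last field; recommended_driver is a hypothetical helper, not part of ubuntu-drivers.&lt;br /&gt;

```python
def recommended_driver(devices_output):
    '''Return the package flagged as recommended, or None if no line is flagged.'''
    for line in devices_output.splitlines():
        fields = line.split()
        # Driver lines look like: driver : PACKAGE - SOURCE LICENCE [recommended]
        if fields[:2] == ['driver', ':'] and fields[-1] == 'recommended':
            return fields[2]
    return None


# A shortened copy of the listing shown on this page.
SAMPLE = '''
  driver   : nvidia-driver-418 - third-party free
  driver   : nvidia-driver-430 - third-party free recommended
  driver   : nvidia-driver-390 - distro non-free
  driver   : xserver-xorg-video-nouveau - distro free builtin
'''

print(recommended_driver(SAMPLE))
```

With the listing above, this selects nvidia-driver-430, which is the driver installed in the next step.&lt;br /&gt;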
&lt;br /&gt;
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
&lt;br /&gt;
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required &lt;br /&gt;
 for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
&lt;br /&gt;
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
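&lt;br /&gt;
The ${PATH:+:${PATH}} idiom in the export lines only appends a colon and the old value when the variable is already set. A small sketch of the same prepend logic, which also skips duplicates, is given here for illustration; prepend_path is a hypothetical helper and nothing in the build depends on it.&lt;br /&gt;

```python
def prepend_path(new_dir, current=''):
    '''Prepend new_dir to a colon-separated path string, dropping duplicates and empty parts.'''
    parts = [p for p in current.split(':') if p and p != new_dir]
    return ':'.join([new_dir] + parts)


print(prepend_path('/usr/local/cuda-10.0/bin', '/usr/local/sbin:/usr/local/bin:/usr/bin'))
print(prepend_path('/usr/local/cuda-10.0/lib64'))
```

The second call shows the empty case: with no existing value, the result is just the new directory with no trailing colon.&lt;br /&gt;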
&lt;br /&gt;
With CUDA 10.0, you don't need to edit rc.local to start the persistence daemon with:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
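&lt;br /&gt;
Both samples end their output with a Result = PASS line when they succeed, so the check can be automated. The following is a sketch that assumes the samples were built in the directory used above; run_sample and sample_passed are hypothetical helper names.&lt;br /&gt;

```python
import subprocess

# Assumed location of the compiled samples, matching the path used on this page.
SAMPLES_BIN = '/usr/local/cuda-10.0/samples/bin/x86_64/linux/release'


def sample_passed(output):
    '''True if a CUDA sample's output contains its final PASS line.'''
    return any(line.strip() == 'Result = PASS' for line in output.splitlines())


def run_sample(name, bin_dir=SAMPLES_BIN):
    '''Run one compiled sample and report whether it exited cleanly and passed.'''
    result = subprocess.run([bin_dir + '/' + name], capture_output=True, text=True)
    return result.returncode == 0 and sample_passed(result.stdout)
```

For example, run_sample('deviceQuery') and run_sample('bandwidthTest') should both return True on a working installation.&lt;br /&gt;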
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
 It was already installed and the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd 850&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
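&lt;br /&gt;
The 7.3T shown for md0 follows from the RAID5 layout: one drive's worth of space holds parity, so three drives give two drives of usable capacity. A quick sketch of the arithmetic, assuming a nominal 4 TB drive is 4 x 10^12 bytes and ignoring the small metadata overhead:&lt;br /&gt;

```python
TIB = 2 ** 40  # lsblk reports sizes in binary units


def raid5_usable_tib(n_drives, drive_bytes):
    '''Usable RAID5 capacity in TiB: (n - 1) drives of data, one of parity.'''
    if not n_drives >= 3:
        raise ValueError('RAID5 needs at least three drives')
    return (n_drives - 1) * drive_bytes / TIB


print(round(raid5_usable_tib(3, 4 * 10 ** 12), 2))
```

This prints 7.28, which is in line with the 7.3T that lsblk reports for the array.&lt;br /&gt;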
&lt;br /&gt;
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID refers to /dev/md0, whereas the UUID in the contents of mdadm.conf is the UUID of the 3 RAID5 drives together:&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array:&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both drives (md0 as backing, m.2 as cache). Note that when you do it in one command, the cache is attached to the backing device automatically.&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you screw up, cd to /sys/fs/bcache/&amp;lt;cache-set-UUID&amp;gt; and then ls -l cache0. If there is an entry in there, echo 1 &amp;gt; stop. This unregisters the cache and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
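&lt;br /&gt;
To avoid copying the UUID by hand, the new fstab line can be generated from the blkid output. This is a sketch only: blkid_uuid and fstab_entry are hypothetical helpers, and the defaults simply reproduce the options used in the entry above.&lt;br /&gt;

```python
import re


def blkid_uuid(blkid_line):
    '''Extract the filesystem UUID from one line of blkid output.'''
    match = re.search(r'\bUUID="([^"]+)"', blkid_line)
    if match is None:
        raise ValueError('no UUID field found')
    return match.group(1)


def fstab_entry(uuid, mount_point, fs_type='ext4', options='rw', dump=0, fsck_pass=0):
    '''Build one fstab line in the field order used on this page.'''
    return 'UUID={} {} {} {} {} {}'.format(uuid, mount_point, fs_type, options, dump, fsck_pass)


line = '/dev/bcache0: UUID="4c63f20b-ad35-477d-bfaa-82571beba841" TYPE="ext4"'
print(fstab_entry(blkid_uuid(line), '/bulk'))
```

The printed line can then be pasted into /etc/fstab in place of the old RAID entry.&lt;br /&gt;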
&lt;br /&gt;
And update your boot image and give it a reboot to check the new bcache array comes back up ok:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
&lt;br /&gt;
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask= 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
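&lt;br /&gt;
For a quick sanity check of the share stanza outside of testparm, the settings can be read with Python's configparser, since the share section uses an INI-like layout. This is only a sketch under that assumption: testparm remains the authoritative check, and read_share is a hypothetical helper.&lt;br /&gt;

```python
import configparser


def read_share(conf_text, share):
    '''Return the options of one share stanza as a plain dict.'''
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return dict(parser[share])


# Only the share stanza, written without indentation to stay clear of
# configparser's handling of indented continuation lines.
STANZA = '''
[bulk]
comment = Bulk RAID Array
path = /bulk
browseable = yes
create mask= 0775
directory mask = 0775
read only = no
guest ok = no
'''

share = read_share(STANZA, 'bulk')
print(share['path'], share['read only'], share['guest ok'])
```

A check like this makes it easy to confirm that the share is writable and closed to guests before restarting the services.&lt;br /&gt;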
&lt;br /&gt;
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
&lt;br /&gt;
And add other users to the samba group (if not already members):&lt;br /&gt;
 usermod -G smbusers researcher #Note that -G without -a replaces the supplementary group list and will overwrite sudo or other group assignments, so don't do it with your main account. Instead append the group:&lt;br /&gt;
  usermod -aG smbusers ed&lt;br /&gt;
&lt;br /&gt;
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use CUDA 10.0&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the latest official CUDA image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
Note: you can remove stopped docker containers (along with unused networks and dangling images) with&lt;br /&gt;
 docker system prune&lt;br /&gt;
 &lt;br /&gt;
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an install directory in /bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
From https://www.anaconda.com/distribution/ the latest version ships with Python 3.7, so:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
&lt;br /&gt;
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
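&lt;br /&gt;
A quick way to confirm that PyTorch can see the GPUs after these installs is sketched below. It assumes nothing beyond the packages installed above, and gpu_report is a hypothetical helper that degrades to a message rather than failing when torch or CUDA is missing.&lt;br /&gt;

```python
def gpu_report():
    '''Describe the CUDA devices PyTorch can see, or say why it cannot see any.'''
    try:
        import torch
    except ImportError:
        return 'torch is not importable in this environment'
    if not torch.cuda.is_available():
        return 'torch imported, but CUDA is not available'
    names = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    return 'CUDA devices: ' + ', '.join(names)


print(gpu_report())
```

On this box we would expect both the TITAN RTX and the TITAN Xp to be listed.&lt;br /&gt;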
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
&lt;br /&gt;
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install a VNC client on the remote machine that you will connect from. We use the standalone exe from TigerVNC. &lt;br /&gt;
&lt;br /&gt;
Now install TightVNC on the box, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As user &lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user (ailia)&lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
To make changes (or after the edit)&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
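&lt;br /&gt;
VNC display numbers map onto TCP ports as 5900 plus the display number, so display :2 listens on 5902. A small sketch of that mapping, plus a basic reachability probe, is given here; vnc_port and is_listening are hypothetical helpers, and the host you probe is up to you.&lt;br /&gt;

```python
import socket

VNC_BASE_PORT = 5900


def vnc_port(display):
    '''TCP port for a VNC display number, e.g. display 2 listens on 5902.'''
    return VNC_BASE_PORT + display


def is_listening(host, port, timeout=2.0):
    '''True if a TCP connection to host:port succeeds within the timeout.'''
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(vnc_port(2))
```

For example, is_listening('localhost', vnc_port(2)) run on the box should return True once the vncserver@2 service is up.&lt;br /&gt;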
&lt;br /&gt;
Instructions on how to set up an IP tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in puTTY.exe and checked to see which local port was listening (it was 5901) and not firewalled using the listening ports tab under network on resmon.exe (it said allowed, not restricted under firewall status). VNC seemed to be running fine on Bastard, and I tried connecting to localhost::1 (that is 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The localhost port seems to be open and listening just fine: &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -p 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
&lt;br /&gt;
====Later Notes====&lt;br /&gt;
&lt;br /&gt;
I came back and changed the resolution to make it work on one of my portrait desktop monitors.&lt;br /&gt;
See https://www.tightvnc.com/vncserver.1.php&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  Change line:&lt;br /&gt;
   ExecStart=/usr/bin/vncserver -depth 24 -geometry 1440x2560 :%i&lt;br /&gt;
  (Note that the size is 2160x3840 divided by 150%.) Leave the color depth alone, as changing it breaks the server.&lt;br /&gt;
 systemctl daemon-reload&lt;br /&gt;
 systemctl enable vncserver@2.service&lt;br /&gt;
&lt;br /&gt;
As Ed:&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
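&lt;br /&gt;
The geometry used above is just the monitor's native resolution divided by the display scaling factor. A short sketch of that arithmetic, with scaled_geometry as a hypothetical helper:&lt;br /&gt;

```python
def scaled_geometry(width, height, scale_percent):
    '''Divide a native resolution by a display scaling factor given in percent.'''
    return '{}x{}'.format(width * 100 // scale_percent, height * 100 // scale_percent)


print(scaled_geometry(2160, 3840, 150))
```

This prints 1440x2560, the value passed to -geometry in the service file.&lt;br /&gt;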
&lt;br /&gt;
Also, try to fix the cut-and-paste issue, as root:&lt;br /&gt;
 apt-get install autocutsel&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.02 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  Which seems ok, despite having a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=2020.2.3&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;br /&gt;
&lt;br /&gt;
Note that when I came back to the box the launcher didn't work...&lt;br /&gt;
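&lt;br /&gt;
One possible cause (an assumption, it wasn't diagnosed here) is that the launcher pins snap revision 214, and that directory goes away when the snap refreshes. snapd keeps a "current" symlink next to the numbered revisions, so a launcher can point at that instead. A minimal sketch, where write_pycharm_launcher and its arguments are hypothetical names:&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Write a PyCharm .desktop launcher that follows the snap's "current"
# symlink instead of a pinned revision such as 214.
# $1 = output file (e.g. /usr/share/applications/pycharm.desktop, as root)
# $2 = snap root; defaults to /snap/pycharm-community/current
# The Version= key is left out: in the Desktop Entry spec it is the version
# of the spec itself, not of the application.
write_pycharm_launcher() {
    out=$1
    root=${2:-/snap/pycharm-community/current}
    {
        printf '%s\n' '[Desktop Entry]'
        printf '%s\n' 'Type=Application'
        printf '%s\n' 'Name=PyCharm'
        printf 'Icon=%s/bin/pycharm.png\n' "$root"
        printf 'Exec="%s/bin/pycharm.sh" %%f\n' "$root"
        printf '%s\n' 'Comment=The Drive to Develop'
        printf '%s\n' 'Categories=Development;IDE;'
        printf '%s\n' 'Terminal=false'
        printf '%s\n' 'StartupWMClass=jetbrains-pycharm'
    } > "$out"
}
```
&lt;br /&gt;
Try it on a scratch file first and compare the result with the hand-written launcher above before overwriting anything.&lt;br /&gt;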
&lt;br /&gt;
==== MATLAB ====&lt;br /&gt;
&lt;br /&gt;
I installed MATLAB R2024a by downloading the zip, running&lt;br /&gt;
 sudo ./install&lt;br /&gt;
&lt;br /&gt;
and using the defaults of /usr/local/MATLAB/R2024 etc. The license number is 41201644.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48656</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48656"/>
		<updated>2024-08-02T21:24:17Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Later Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. NVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbox &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a XEON e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD hard drive (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants on them, but their prices are high too and the details on their specs are limited (the MoBo and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts cost is perhaps $4-5k now for the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon e5-2620v3, 6 core @ 2.4GHz, 6x256KB level 2 cache, 15MB level 3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will build the RAID 5 array in software, rather than using the X99 chipset's RAID through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (rightmost) network port is connected to a DHCP broadcasting router.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
===Hardware and Drivers===&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
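&lt;br /&gt;
If you want to script the choice rather than read it off the listing, the recommended package can be pulled out of that output. A minimal sketch, assuming the output keeps the "driver : NAME - ... recommended" layout shown above (recommended_driver is a hypothetical helper name):&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Print the package that `ubuntu-drivers devices` marks as recommended.
# Reads the command's output on stdin, so it also works on saved text.
recommended_driver() {
    awk '$1 == "driver" { if ($NF == "recommended") { print $3; exit } }'
}

# Usage on the box itself (needs ubuntu-drivers-common):
#   ubuntu-drivers devices | recommended_driver
```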
&lt;br /&gt;
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
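&lt;br /&gt;
The same blacklist can be written without opening an editor, and the result checked from lsmod rather than by eye. A rough sketch under those assumptions (both function names are hypothetical):&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Write the nouveau blacklist file with the two lines used above.
# $1 = target directory; defaults to /etc/modprobe.d (needs root there).
write_nouveau_blacklist() {
    dir=${1:-/etc/modprobe.d}
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$dir/blacklist-nouveau.conf"
}

# Succeed if module $1 appears in lsmod-style text read from stdin.
# The first line of lsmod output is a header, so it is skipped.
module_loaded() {
    awk -v m="$1" 'NR > 1 { if ($1 == m) found = 1 } END { exit !found }'
}

# Usage after the reboot:
#   lsmod | module_loaded nouveau || echo "nouveau is not loaded"
```
&lt;br /&gt;
You still need update-initramfs -u and a reboot after writing the file.&lt;br /&gt;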
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
&lt;br /&gt;
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required &lt;br /&gt;
 for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
&lt;br /&gt;
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
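&lt;br /&gt;
The ${PATH:+:${PATH}} form in the export lines adds the colon only when the variable is already set and non-empty. /etc/environment is read as plain KEY=value pairs rather than by a shell, so that expansion is only available in the export form. A minimal sketch of the same idea as a reusable helper that also skips duplicates (prepend_path is a hypothetical name, not part of CUDA or Ubuntu):&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Print a colon-separated path list with $1 prepended to the list in $2.
# ${2:+:$2} expands to ":$2" only when $2 is non-empty, so an empty list
# never produces a trailing colon (which is treated as the current directory).
prepend_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;           # already present: unchanged
        *)        printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

# Usage for the current shell only:
#   PATH=$(prepend_path /usr/local/cuda-10.0/bin "$PATH"); export PATH
```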
&lt;br /&gt;
With version cuda 10.0, you don't need to edit rc.local to start the persistence daemon:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
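&lt;br /&gt;
Both samples print a "Result = PASS" line when they succeed, so the check can be scripted. A small sketch, assuming the release directory used above (sample_passed and run_cuda_checks are hypothetical names):&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Succeed only if the text on stdin contains the "Result = PASS" line
# that the CUDA samples print on success.
sample_passed() {
    grep -q 'Result = PASS'
}

# Run deviceQuery and bandwidthTest from directory $1 and stop at the
# first one that does not report a pass.
run_cuda_checks() {
    for t in deviceQuery bandwidthTest; do
        "$1/$t" | sample_passed || { echo "$t failed"; return 1; }
    done
}

# Usage:
#   run_cuda_checks /usr/local/cuda-10.0/samples/bin/x86_64/linux/release
```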
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
 It was already installed and the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd 850&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
&lt;br /&gt;
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID refers to /dev/md0, whereas the UUID in the contents of mdadm.conf is the UUID of the 3 RAID5 drives together:&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
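&lt;br /&gt;
In /proc/mdstat the line after the array header ends with a counter such as [3/3] [UUU], giving the number of devices in the array and the number currently working. A minimal sketch that turns that into a pass/fail check before the array is wiped (md_is_healthy is a hypothetical helper, and it assumes the counter sits on the line directly after the header):&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Succeed if array $1 (e.g. md0) is active and every member is up, judged
# by the "[devices/working]" counter on the line after the array header.
# $2 = file to read; defaults to /proc/mdstat.
md_is_healthy() {
    awk -v name="$1" '
        $1 == name { if ($3 == "active") found = 1; next }
        found == 1 {
            if (match($0, /\[[0-9]+\/[0-9]+\]/)) {
                split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
                if (n[1] == n[2]) ok = 1
            }
            exit
        }
        END { if (ok == 1) exit 0; exit 1 }
    ' "${2:-/proc/mdstat}"
}

# Usage:
#   md_is_healthy md0 || echo "md0 is missing, inactive or degraded"
```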
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array:&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both drives (md0 as backing, m.2 as cache). Note that when you do it in one command the cache is attached to the backing device automatically.&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you screw up, cd to /sys/fs/bcache/whatever and then ls -l cache0. If there is an entry in there echo 1 &amp;gt; stop. This unregisters the cache and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
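&lt;br /&gt;
The fstab edit can also be done non-interactively. A rough sketch that comments out whatever entry currently uses a mount point and appends the new one, using the same "0 0" trailing fields as the entry above (swap_fstab_entry is a hypothetical name; run it on a copy of fstab first):&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Comment out the existing entry for a mount point and append a new one.
# $1 = fstab file, $2 = mount point, $3 = new UUID,
# $4 = filesystem type, $5 = mount options.
swap_fstab_entry() {
    file=$1; mnt=$2; uuid=$3; fstype=$4; opts=$5
    tmp=$(mktemp) || return 1
    awk -v mnt="$mnt" '
        /^[ \t]*#/ { print; next }          # keep existing comments as-is
        $2 == mnt  { print "#" $0; next }   # disable the old entry
                   { print }
    ' "$file" > "$tmp" || return 1
    printf 'UUID=%s %s %s %s 0 0\n' "$uuid" "$mnt" "$fstype" "$opts" >> "$tmp"
    cat "$tmp" > "$file"                    # overwrite in place, keeping permissions
    rm -f "$tmp"
}

# Usage (as root, after cp /etc/fstab /etc/fstab.org):
#   swap_fstab_entry /etc/fstab /bulk 4c63f20b-ad35-477d-bfaa-82571beba841 ext4 rw
```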
&lt;br /&gt;
And update your boot image and give it a reboot to check the new bcache array comes back up ok:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
&lt;br /&gt;
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask= 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
&lt;br /&gt;
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -p 1234 -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
&lt;br /&gt;
And add users to the samba group (if not already):&lt;br /&gt;
 usermod -G smbusers researcher #Note that -G without -a replaces the user's supplementary groups, so it will overwrite sudo or other group assignments. Don't do it with your main account. Instead just:&lt;br /&gt;
  adduser ed smbusers&lt;br /&gt;
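&lt;br /&gt;
To confirm that a group change landed and nothing else was lost, compare the output of id -nG before and after. A minimal sketch of the membership check (list_has_group is a hypothetical helper; it takes the group list as text so it can be tried without touching real accounts):&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Succeed if group $2 appears in the space-separated group list $1,
# which is the format printed by `id -nG USER`.
list_has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage:
#   list_has_group "$(id -nG ed)" smbusers || echo "ed is not in smbusers"
#   list_has_group "$(id -nG ed)" sudo     || echo "ed lost sudo"
```
&lt;br /&gt;
A user who is already logged in only picks up the new group at the next login.&lt;br /&gt;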
&lt;br /&gt;
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use cuda 10.0:&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the latest official CUDA image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
Note: you can remove stopped docker containers (along with unused networks, dangling images and build cache) with&lt;br /&gt;
 docker system prune&lt;br /&gt;
 &lt;br /&gt;
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an installs directory in bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
From https://www.anaconda.com/distribution/ the latest Python version is 3.7, so:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
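&lt;br /&gt;
Running sha256sum only prints the digest; it still has to be compared with the value Anaconda publishes for that installer. A small sketch of the comparison (verify_sha256 is a hypothetical helper, and EXPECTED stands in for the published digest, which isn't reproduced here):&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Succeed only if the SHA-256 of file $1 equals the expected hex digest $2.
verify_sha256() {
    actual=$(sha256sum "$1" | awk '{ print $1 }') || return 1
    [ "$actual" = "$2" ]
}

# Usage (EXPECTED is a placeholder for the digest from Anaconda's hash list):
#   verify_sha256 Anaconda3-2019.03-Linux-x86_64.sh "$EXPECTED" || echo "hash mismatch"
```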
&lt;br /&gt;
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
&lt;br /&gt;
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install a VNC client on the remote machine that you will connect from. We use the standalone exe from TigerVNC.&lt;br /&gt;
&lt;br /&gt;
Now install TightVNC on the DevBox, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As user &lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user (ailia)&lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
After creating the unit file (or after any later edit to it), reload systemd and restart the service:&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
&lt;br /&gt;
Instructions on how to set up an IP tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in PuTTY and checked which local port was listening (it was 5901). I also checked that it was not firewalled, using the listening ports tab under network in resmon.exe (the firewall status said allowed, not restricted). VNC seemed to be running fine on Bastard, and I tried connecting to localhost::1 (that is 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The localhost port seems to be open and listening just fine: &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -p 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
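&lt;br /&gt;
For reference, the port numbers in this section follow the usual VNC convention that display :N listens on TCP port 5900+N. The sketch below just does that arithmetic; vnc_port is a hypothetical helper name. It is also worth noting, as an assumption to check against your viewer's documentation, that the common viewers (TightVNC, TigerVNC, RealVNC) read host:N as a display number and host::PORT as a raw TCP port, so an address like localhost::1 may be interpreted as TCP port 1 rather than display 1.&lt;br /&gt;
&lt;br /&gt;
```shell
# Map a VNC display number to its TCP port (assumes the 5900+N convention).
vnc_port() {
    display="${1#:}"          # accept either "2" or ":2"
    echo $(( 5900 + display ))
}

vnc_port :1    # local end of the tunnel
vnc_port :2    # vncserver@2 on the DevBox
```
&lt;br /&gt;
This prints 5901 and 5902, matching the local tunnel port and the port that Xtightvnc is listening on in the netstat output.&lt;br /&gt;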
&lt;br /&gt;
====Later Notes====&lt;br /&gt;
&lt;br /&gt;
I came back and changed the resolution to make it work on one of my portrait desktop monitors.&lt;br /&gt;
See https://www.tightvnc.com/vncserver.1.php&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  Change line:&lt;br /&gt;
   ExecStart=/usr/bin/vncserver -depth 24 -geometry 1440x2560 :%i&lt;br /&gt;
  (Note that the size is 2160x3840 divided by 150%.) Leave the color depth alone: as noted above, changing it breaks the server.&lt;br /&gt;
 systemctl daemon-reload&lt;br /&gt;
 systemctl enable vncserver@2.service&lt;br /&gt;
&lt;br /&gt;
As Ed:&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
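&lt;br /&gt;
The geometry above comes from dividing the panel's native resolution by the desktop scaling factor. A minimal sketch of that arithmetic in shell, assuming the 2160x3840 panel and 150% scaling mentioned in these notes; scaled_geometry is a hypothetical helper name.&lt;br /&gt;
&lt;br /&gt;
```shell
# Divide a WIDTHxHEIGHT resolution by a percentage scale factor.
scaled_geometry() {
    width="${1%x*}"           # text before the "x"
    height="${1#*x}"          # text after the "x"
    scale="$2"                # e.g. 150 for 150%
    echo "$(( width * 100 / scale ))x$(( height * 100 / scale ))"
}

scaled_geometry 2160x3840 150
```
&lt;br /&gt;
This prints 1440x2560, the value passed to -geometry. Integer division truncates, so other scale factors may need rounding by hand.&lt;br /&gt;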
&lt;br /&gt;
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.2 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  Which seems ok, despite having a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=2020.2.3&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;br /&gt;
&lt;br /&gt;
Note that when I came back to the box the launcher didn't work...&lt;br /&gt;
&lt;br /&gt;
==== MATLAB ====&lt;br /&gt;
&lt;br /&gt;
I installed MATLAB R2024a by downloading the zip, running&lt;br /&gt;
 sudo ./install&lt;br /&gt;
&lt;br /&gt;
and using the defaults of /usr/local/MATLAB/R2024 etc. The license number is 41201644.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48655</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48655"/>
		<updated>2024-08-02T21:03:03Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Connection Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. NVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbox &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a XEON e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD hard drive (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants on them, but their prices are high too and the details on their specs are limited (the MoBo and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts cost is perhaps $4-5k now for the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon e5-2620v3, 6 cores @ 2.4GHz, 6x256KB level 2 cache, 15MB level 3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will set up the RAID 5 array in software, rather than using the X99 chipset RAID through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (rightmost) network port is connected to a router that provides DHCP.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
====Hardware and Drivers====&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
&lt;br /&gt;
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required &lt;br /&gt;
 for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
&lt;br /&gt;
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
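&lt;br /&gt;
The ${PATH:+:${PATH}} expansion in the single-user exports appends a colon and the old value only when the variable is already set and non-empty. That avoids a stray trailing colon, which matters because an empty entry in PATH is treated as the current directory. A small sketch of the same idiom follows; prepend_dir is a hypothetical helper used only for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
# Prepend a directory to a colon-separated list without leaving a trailing colon.
prepend_dir() {
    dir="$1"
    current="$2"
    # ${current:+:${current}} expands to ":$current" only if current is non-empty.
    echo "${dir}${current:+:${current}}"
}

prepend_dir /usr/local/cuda-10.0/lib64 ""
prepend_dir /usr/local/cuda-10.0/bin /usr/bin:/bin
```
&lt;br /&gt;
The first call prints the directory on its own and the second prints /usr/local/cuda-10.0/bin:/usr/bin:/bin. The /etc/environment file does not perform this kind of expansion, which is why the full literal paths are written out there.&lt;br /&gt;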
&lt;br /&gt;
With CUDA 10.0, you no longer need to edit rc.local to start the persistence daemon with a line like:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
 It was already installed and the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd 850&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
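&lt;br /&gt;
The 7.3T shown for md0 is consistent with RAID5 over three 4TB drives: RAID5 uses one disk's worth of space for parity, so the usable capacity is (n-1) times the disk size. Here that is 8TB in decimal units, which lsblk reports as roughly 7.3T in binary units. A minimal sketch of the arithmetic, assuming equally sized disks; raid5_usable is a hypothetical helper name.&lt;br /&gt;
&lt;br /&gt;
```shell
# Usable capacity of a RAID5 array: one disk's worth of space goes to parity.
# Usage: raid5_usable NUMBER_OF_DISKS SIZE_PER_DISK
raid5_usable() {
    disks="$1"
    size="$2"
    echo $(( (disks - 1) * size ))
}

raid5_usable 3 4    # three 4TB WD Red drives
```
&lt;br /&gt;
This prints 8, the usable size in TB. Adding a fourth 4TB drive would give 12TB by the same formula.&lt;br /&gt;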
&lt;br /&gt;
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID refers to /dev/md0, whereas the UUID in the contents of mdadm.conf is the UUID of the 3 RAID5 drives together:&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array:&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both drives (md0 as backing, m.2 as cache). Note that when you do it in one command, the assignment (attaching the cache to the backing device) is automatic.&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you screw up, cd to /sys/fs/bcache/whatever and then ls -l cache0. If there is an entry in there echo 1 &amp;gt; stop. This unregisters the cache and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
&lt;br /&gt;
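The UUID lookup and the new entry can also be scripted. What follows is only a sketch, not what was run on the box: fstab_entry is a hypothetical helper name, and the ext4, rw and 0 0 fields are simply copied from the entry above.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper (not part of the original build): turn one line of
# blkid output into an fstab entry with the same fields as the entry above.
fstab_entry() {
    blkid_line=$1
    mount_point=$2
    # Pull out the filesystem UUID; the leading space avoids matching PARTUUID.
    uuid=$(printf '%s\n' "$blkid_line" | sed -n 's/.* UUID="\([^"]*\)".*/\1/p')
    if [ -z "$uuid" ]; then
        # No UUID in the line: signal failure and print nothing.
        return 1
    fi
    printf 'UUID=%s %s ext4 rw 0 0\n' "$uuid" "$mount_point"
}

# Example with the blkid output shown above:
fstab_entry '/dev/bcache0: UUID="4c63f20b-ad35-477d-bfaa-82571beba841" TYPE="ext4"' /bulk
# prints: UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0
```
&lt;br /&gt;
In practice you would feed it the output of blkid /dev/bcache0, check the result, and paste it into /etc/fstab by hand.&lt;br /&gt;
&lt;br /&gt;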
And update your boot image and give it a reboot to check the new bcache array comes back up ok:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
&lt;br /&gt;
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask = 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
&lt;br /&gt;
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group:&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -p 1234 -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
&lt;br /&gt;
And add users to the samba group (if not already):&lt;br /&gt;
 usermod -G smbusers researcher #Note that -G without -a replaces the user's supplementary groups, so it will overwrite sudo or other group assignments. Don't do it with your main account. Instead just:&lt;br /&gt;
  adduser ed smbusers&lt;br /&gt;
&lt;br /&gt;
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use CUDA 10.0:&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the official CUDA 10.0 base image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
Note: you can clean up stopped containers (along with unused networks, dangling images and build cache) with:&lt;br /&gt;
 docker system prune&lt;br /&gt;
 &lt;br /&gt;
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an install directory in /bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
According to https://www.anaconda.com/distribution/ the latest Python version is 3.7, so:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
&lt;br /&gt;
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
&lt;br /&gt;
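Before setting up a second environment, it may help to check a candidate interpreter against the range quoted above. This is a minimal sketch: theano_supports is a hypothetical helper name, and the bounds are taken from the note above rather than verified against the Theano documentation.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical check: does a Python version fall in the range quoted above
# (3.4 or later, but earlier than 3.6)? The function name is illustrative.
theano_supports() {
    major=${1%%.*}     # text before the first dot
    rest=${1#*.}       # text after the first dot
    minor=${rest%%.*}  # second component of the version
    [ "$major" -eq 3 ] || return 1
    [ "$minor" -ge 4 ] || return 1
    [ "$minor" -lt 6 ]
}

for v in 3.5.6 3.7.3; do
    if theano_supports "$v"; then
        echo "${v}: in range"
    else
        echo "${v}: out of range"
    fi
done
```
&lt;br /&gt;
With the 3.7.3 that Anaconda installed here, the check fails, which is why a separate Python and virtual environment would be needed.&lt;br /&gt;
&lt;br /&gt;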
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install a VNC client on the remote machine that you will connect from. We use the standalone exe from TigerVNC.&lt;br /&gt;
&lt;br /&gt;
Now install the TightVNC server on the box, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As the user:&lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user (ailia)&lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
To make changes (or after the edit):&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with:&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
&lt;br /&gt;
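The display number also fixes the TCP port that the server listens on, which matters when setting up a tunnel or a firewall rule. As a small sketch (vnc_port is an illustrative name, not a real tool), the usual convention of 5900 plus the display number looks like this:&lt;br /&gt;
&lt;br /&gt;
```shell
# Illustrative helper: map a VNC display number to its TCP port.
# VNC servers conventionally listen on 5900 plus the display number.
vnc_port() {
    echo $((5900 + $1))
}

vnc_port 2   # prints 5902
```
&lt;br /&gt;
So display :2 on this box should show up as port 5902 in netstat.&lt;br /&gt;
&lt;br /&gt;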
Instructions on how to set up an IP tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in PuTTY.exe and checked to see which local port was listening (it was 5901) and not firewalled using the listening ports tab under network on resmon.exe (it said allowed, not restricted under firewall status). VNC seemed to be running fine on Bastard, and I tried connecting to localhost::1 (that is 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The localhost port seems to be open and listening just fine: &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -p 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
&lt;br /&gt;
====Later Notes====&lt;br /&gt;
&lt;br /&gt;
I came back and changed the resolution to make it work on one of my portrait desktop monitors.&lt;br /&gt;
&lt;br /&gt;
As root:&lt;br /&gt;
 vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  Change line:&lt;br /&gt;
   ExecStart=/usr/bin/vncserver -depth 24 -geometry 1440x2560 :%i&lt;br /&gt;
  (Note that the size is 2160x3840 divided by 150%.) Leave the color depth alone, as noted above changing it breaks things.&lt;br /&gt;
 systemctl daemon-reload&lt;br /&gt;
 systemctl enable vncserver@2.service&lt;br /&gt;
&lt;br /&gt;
As Ed:&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
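The -geometry value used above is the monitor's native resolution divided by the scaling percentage. A minimal sketch of that arithmetic, with scaled_geometry as a made-up helper name and integer division assumed to be good enough:&lt;br /&gt;
&lt;br /&gt;
```shell
# Illustrative helper: divide a native resolution by a display scaling
# percentage to get the value for -geometry. Names here are assumptions.
scaled_geometry() {
    width=$1
    height=$2
    percent=$3
    echo "$((width * 100 / percent))x$((height * 100 / percent))"
}

scaled_geometry 2160 3840 150   # prints 1440x2560
```
&lt;br /&gt;
For a landscape monitor, swap the first two arguments.&lt;br /&gt;
&lt;br /&gt;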
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.02 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  Which seems ok, despite having a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=2020.2.3&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;br /&gt;
&lt;br /&gt;
Note that when I came back to the box the launcher didn't work...&lt;br /&gt;
&lt;br /&gt;
==== MATLAB ====&lt;br /&gt;
&lt;br /&gt;
I installed MATLAB R2024a by downloading the zip, running&lt;br /&gt;
 sudo ./install&lt;br /&gt;
&lt;br /&gt;
and using the defaults of /usr/local/MATLAB/R2024 etc. The license number is 41201644.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=MediaWiki:Sidebar&amp;diff=48650</id>
		<title>MediaWiki:Sidebar</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=MediaWiki:Sidebar&amp;diff=48650"/>
		<updated>2024-08-02T20:45:17Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;* navigation&lt;br /&gt;
&lt;br /&gt;
*Sites&lt;br /&gt;
** mainpage|Wiki&lt;br /&gt;
** http://www.edegan.com/articles|Articles&lt;br /&gt;
&lt;br /&gt;
*Sections&lt;br /&gt;
** Projects|Projects&lt;br /&gt;
** Paper Development|Papers in Development&lt;br /&gt;
** Article Summaries|Paper Reviews&lt;br /&gt;
** Team member|Team Members&lt;br /&gt;
** U.S. Federal Legislation|Legislation&lt;br /&gt;
** Infrastructure|Research Computing&lt;br /&gt;
&lt;br /&gt;
*Organizations&lt;br /&gt;
** Kauffman Incubator Project|Incubator Project&lt;br /&gt;
** McNair Center|McNair Center&lt;br /&gt;
** BPP|Berkeley's BPP Group&lt;br /&gt;
** NBER Patent Data |NBER Patent Data&lt;br /&gt;
&lt;br /&gt;
*Help&lt;br /&gt;
** Help:General |General help&lt;br /&gt;
** Help:Team |Team help&lt;br /&gt;
** Help:Administration | Administration&lt;br /&gt;
** Special:BatchUpload|Batch Upload Files&lt;br /&gt;
* SEARCH&lt;br /&gt;
* TOOLBOX&lt;br /&gt;
* LANGUAGES&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48649</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48649"/>
		<updated>2024-08-01T20:42:50Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Other Software */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. NVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbox &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a XEON e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD hard drive (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants of it, but their prices are high too and the details on their specs are limited (the motherboard and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts cost is perhaps $4-5k now for the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon e5-2620v3, 6 core @ 2.4GHz, 6x256KB level 2 cache, 15MB level 3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will build the RAID 5 array in software, rather than using the X99 chipset RAID through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (rightmost) network port is connected to a router that is serving DHCP.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
===Hardware and Drivers===&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
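Picking the recommended entry out of that listing can be scripted. The following is a sketch only: recommended_driver is a hypothetical helper, and it assumes the output keeps the layout shown above (package name in the third field, the word recommended at the end of the line).&lt;br /&gt;
&lt;br /&gt;
```shell
# Illustrative helper: read `ubuntu-drivers devices` output on stdin and
# print the package on the first driver line that ends in "recommended".
recommended_driver() {
    awk '/^ *driver/ { if ($NF == "recommended") { print $3; exit } }'
}

# Example using three of the lines from the listing above:
printf '%s\n' \
    'driver   : nvidia-driver-418 - third-party free' \
    'driver   : nvidia-driver-430 - third-party free recommended' \
    'driver   : nvidia-driver-390 - distro non-free' | recommended_driver
# prints: nvidia-driver-430
```
&lt;br /&gt;
On the box itself the input would come from ubuntu-drivers devices piped into the function.&lt;br /&gt;
&lt;br /&gt;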
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
&lt;br /&gt;
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required &lt;br /&gt;
 for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
&lt;br /&gt;
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
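&lt;br /&gt;
The ${PATH:+:${PATH}} expansion in the single-user exports adds the colon and the old value only when the variable is already set and non-empty, which avoids leaving a stray trailing colon. The sketch below shows the same expansion in isolation; prepend_path is a hypothetical helper name used only for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print directory $1 followed by the existing
# colon-separated list $2, adding the colon only when $2 is non-empty.
prepend_path() {
    printf '%s%s\n' "$1" "${2:+:$2}"
}

# Existing list present: the new directory is joined with a colon.
prepend_path /usr/local/cuda-10.0/bin /usr/bin:/bin
# Existing list empty: no trailing colon is produced.
prepend_path /usr/local/cuda-10.0/lib64 ''
```
&lt;br /&gt;
The second call matters for LD_LIBRARY_PATH, which is often unset on a fresh install.&lt;br /&gt;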
&lt;br /&gt;
With CUDA 10.0, you no longer need to add the following line to rc.local to start the persistence daemon:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
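&lt;br /&gt;
To check the result without reading through the output, the sample output can be scanned for its pass marker. This is a minimal sketch: sample_passed is a hypothetical helper name, and it assumes the samples report success with a line containing &quot;Result = PASS&quot;, which is what deviceQuery and bandwidthTest printed here.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: succeed when the sample output on stdin contains the
# pass marker. Assumes the samples report success as "Result = PASS".
sample_passed() {
    grep -q 'Result = PASS'
}
```
&lt;br /&gt;
From the release directory, ./deviceQuery | sample_passed then returns a zero exit status on success.&lt;br /&gt;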
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
 It was already installed and the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd 870&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
&lt;br /&gt;
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID in fstab is the filesystem UUID of /dev/md0, whereas the UUID in mdadm.conf identifies the array made up of the three RAID5 member drives:&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
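&lt;br /&gt;
The same check can be scripted. The sketch below looks for an &quot;active&quot; entry for a named array in /proc/mdstat style text; array_active is a hypothetical helper name, and it assumes the usual &quot;md0 : active raid5 ...&quot; line layout.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: succeed when array $1 (for example md0) is listed as
# active in the /proc/mdstat style text on stdin.
array_active() {
    grep -q "^$1 : active"
}
```
&lt;br /&gt;
For example: cat /proc/mdstat | array_active md0&lt;br /&gt;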
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array:&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both devices (md0 as backing, m.2 as cache). Note that when you do it in one command, the cache is attached to the backing device automatically:&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you screw up, cd to /sys/fs/bcache/&amp;lt;cache set UUID&amp;gt; and run ls -l cache0. If there is an entry in there, run echo 1 &amp;gt; stop. This unregisters the cache set and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
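&lt;br /&gt;
Once the device is up, the caching mode can be read back from sysfs. According to the kernel bcache documentation linked above, /sys/block/bcache0/bcache/cache_mode lists the available modes with the current one in square brackets. The sketch below extracts the bracketed entry; active_cache_mode is a hypothetical helper name.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print the bracketed (currently selected) entry from a
# cache_mode style line such as "[writethrough] writeback writearound none".
active_cache_mode() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# Sample input in the format described by the kernel documentation.
printf '%s\n' '[writethrough] writeback writearound none' | active_cache_mode
```
&lt;br /&gt;
On the box this would be: cat /sys/block/bcache0/bcache/cache_mode | active_cache_mode&lt;br /&gt;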
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
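&lt;br /&gt;
Copying a UUID by hand is error-prone, so the new entry can also be generated from the blkid output. This is only a sketch: fstab_line_from_blkid is a hypothetical helper name, and it hard-codes the ext4, rw, 0 0 fields used in the entry above.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: read one blkid output line on stdin and print an fstab
# entry that mounts that filesystem at mount point $1 (ext4, rw, no fsck).
fstab_line_from_blkid() {
    # The leading space keeps PARTUUID= fields from matching.
    uuid=$(sed -n 's/.* UUID="\([^"]*\)".*/\1/p')
    [ -n "$uuid" ] || return 1
    printf 'UUID=%s %s ext4 rw 0 0\n' "$uuid" "$1"
}

# Sample input using the UUID reported for /dev/bcache0 above.
printf '%s\n' '/dev/bcache0: UUID="4c63f20b-ad35-477d-bfaa-82571beba841" TYPE="ext4"' |
    fstab_line_from_blkid /bulk
```
&lt;br /&gt;
On the box this would be: blkid /dev/bcache0 | fstab_line_from_blkid /bulk, with the output pasted into /etc/fstab after review.&lt;br /&gt;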
&lt;br /&gt;
And update your boot image and give it a reboot to check the new bcache array comes back up ok:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
&lt;br /&gt;
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask= 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
&lt;br /&gt;
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
&lt;br /&gt;
And add users to the samba group (if not already). Note that usermod -G sets the full list of supplementary groups, so it will overwrite sudo or other group assignments; don't do it with your main account:&lt;br /&gt;
 usermod -G smbusers researcher&lt;br /&gt;
Instead, for an existing account, just append the group:&lt;br /&gt;
 adduser ed smbusers&lt;br /&gt;
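&lt;br /&gt;
To confirm that an account really is in the share group, the output of id -nG can be checked for an exact group name. A minimal sketch follows; in_group is a hypothetical helper name, and it assumes group names contain no spaces.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: succeed when group $1 appears as a whole word in the
# space-separated group list on stdin (the format printed by id -nG).
in_group() {
    tr ' ' '\n' | grep -qxF "$1"
}
```
&lt;br /&gt;
For example: id -nG ed | in_group smbusers&lt;br /&gt;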
&lt;br /&gt;
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use CUDA 10.0:&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the official CUDA 10.0 base image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
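Note: you can force-remove the running DIGITS container and then clean up stopped containers and other unused data with:&lt;br /&gt;
 docker rm -f digits&lt;br /&gt;
 docker system prune&lt;br /&gt;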
 &lt;br /&gt;
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an install directory in /bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
From https://www.anaconda.com/distribution/ the latest distribution ships Python 3.7, so:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
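&lt;br /&gt;
The sha256sum output is only useful if it is compared against the hash that Anaconda publishes for that installer. The sketch below does the comparison; checksum_matches is a hypothetical helper name, and the expected hash has to be copied from Anaconda's site by hand (it is deliberately not reproduced here).&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: succeed when the SHA-256 digest of file $1 equals the
# expected hex digest $2. Assumes GNU coreutils sha256sum is available.
checksum_matches() {
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}
```
&lt;br /&gt;
For example, with the published hash stored in a variable named EXPECTED (an assumed name): checksum_matches Anaconda3-2019.03-Linux-x86_64.sh &quot;$EXPECTED&quot;&lt;br /&gt;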
&lt;br /&gt;
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
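&lt;br /&gt;
The version constraint is easy to test before creating a new environment. The sketch below encodes the range stated above (at least 3.4 and below 3.6); theano_compatible is a hypothetical helper name, and it assumes a dotted version string such as 3.7.3.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: succeed when the Python version string $1 is 3.4.x or
# 3.5.x, the range quoted above for Theano v.1.
theano_compatible() {
    major=${1%%.*}       # text before the first dot
    rest=${1#*.}         # text after the first dot
    minor=${rest%%.*}    # text up to the next dot
    [ "$major" -eq 3 ] || return 1
    [ "$minor" -ge 4 ] || return 1
    [ "$minor" -lt 6 ]
}
```
&lt;br /&gt;
For example, theano_compatible 3.7.3 fails, which matches the current install, while theano_compatible 3.5.2 succeeds.&lt;br /&gt;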
&lt;br /&gt;
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install a VNC client on the machine you will connect from. We use the standalone exe from TigerVNC.&lt;br /&gt;
&lt;br /&gt;
Now install the TightVNC server on the DevBox, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As user &lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user (ailia)&lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
After creating or editing the unit file:&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
&lt;br /&gt;
Instructions on how to set up an IP tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in puTTY.exe and checked to see which local port was listening (it was 5901) and not firewalled using the listening ports tab under network on resmon.exe (it said allowed, not restricted under firewall status). VNC seemed to be running fine on Bastard, and I tried connecting to localhost::1 (that is 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The localhost port seems to be open and listening just fine: &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -p 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
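&lt;br /&gt;
For reference, VNC display :N conventionally listens on TCP port 5900+N, and most viewers read host:N as a display number but host::P as a raw port. If that holds for the client used here, an address like localhost::1 would have been read as port 1 rather than display 1, which might explain the refused connection; this is a guess rather than something I verified. The sketch below just does the arithmetic and builds the forwarding argument for ssh -L; vnc_port and tunnel_spec are hypothetical helper names.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: map a VNC display number to its TCP port, assuming
# the conventional offset of 5900.
vnc_port() {
    echo $((5900 + $1))
}

# Hypothetical helper: build the argument for ssh -L, forwarding local
# display $1 to display $2 on the remote side.
tunnel_spec() {
    printf '%s:localhost:%s\n' "$(vnc_port "$1")" "$(vnc_port "$2")"
}

# Local display 1 forwarded to remote display 2, as in the setup above.
tunnel_spec 1 2
```
&lt;br /&gt;
With OpenSSH rather than PuTTY, the equivalent tunnel would be opened with something like ssh -L &quot;$(tunnel_spec 1 2)&quot; user@192.168.2.202, where user is a placeholder account name.&lt;br /&gt;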
&lt;br /&gt;
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.02 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  Which seems ok, despite having a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=2020.2.3&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;br /&gt;
&lt;br /&gt;
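Note that when I came back to the box the launcher didn't work. A likely cause is that a snap refresh changed the revision number (214) in the paths above; /snap/pycharm-community/current/ should always point to the installed revision.&lt;br /&gt;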
&lt;br /&gt;
==== MATLAB ====&lt;br /&gt;
&lt;br /&gt;
I installed MATLAB R2024a by downloading the zip, running&lt;br /&gt;
 sudo ./install&lt;br /&gt;
&lt;br /&gt;
and using the defaults of /usr/local/MATLAB/R2024 etc. The license number is 41201644.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48648</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48648"/>
		<updated>2024-08-01T20:26:08Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* VNC */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. NVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbo &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a XEON e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD hard drive (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k, buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants of it, but their prices are high too and the details of their specs are limited (the motherboard and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts cost is perhaps $4-5k now for the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon e5-2620v3, 6 cores @ 2.4GHz, 6x256KB L2 cache, 15MB L3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will do the RAID 5 array in software, rather than using the X99 RAID through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
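&lt;br /&gt;
Once the OS is installed, it is worth confirming that the box really did boot in legacy mode. The following is a minimal sketch, assuming a Linux kernel that exposes /sys/firmware/efi only when started by UEFI firmware; the function name boot_mode is just an illustrative choice:&lt;br /&gt;
&lt;br /&gt;
```shell
# Report whether the running system was booted via UEFI or legacy BIOS.
# Assumption: /sys/firmware/efi exists only on a UEFI boot.
boot_mode() {
    if [ -d /sys/firmware/efi ]; then
        echo "UEFI"
    else
        echo "Legacy BIOS"
    fi
}

boot_mode
```
&lt;br /&gt;
For this build, the expected answer is Legacy BIOS.&lt;br /&gt;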
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DiGIT DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (right most) network port is connected to a DHCP broadcasting router.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
===Hardware and Drivers===&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
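&lt;br /&gt;
As a cross-check on the lspci output, you can also look for nouveau in the kernel's list of loaded modules. This is a sketch rather than part of the original procedure; the helper name nouveau_loaded and its optional file argument are hypothetical, and it assumes /proc/modules lists one module per line with the name first:&lt;br /&gt;
&lt;br /&gt;
```shell
# Succeed if the nouveau module appears in the module list.
# $1 is an optional path to a module list (defaults to /proc/modules).
nouveau_loaded() {
    # -q: no output, -s: stay quiet if the file is missing
    grep -qs '^nouveau ' "${1:-/proc/modules}"
}

if nouveau_loaded; then
    echo "nouveau is still loaded"
else
    echo "nouveau is not loaded"
fi
```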
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
&lt;br /&gt;
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required &lt;br /&gt;
 for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin:/usr/local/cuda-10.0${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
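&lt;br /&gt;
The ${VAR:+:${VAR}} idiom in these exports appends a colon and the old value only when the variable is already set and non-empty. That matters for LD_LIBRARY_PATH in particular, where a stray empty entry is treated as the current directory. A small sketch of the same expansion, using a hypothetical helper called prepend_path:&lt;br /&gt;
&lt;br /&gt;
```shell
# Prepend a directory to a colon-separated list without leaving a
# trailing colon when the existing list is empty.
# $1 = directory to add, $2 = current value (may be empty)
prepend_path() {
    printf '%s\n' "$1${2:+:$2}"
}

prepend_path /usr/local/cuda-10.0/lib64 ""          # prints /usr/local/cuda-10.0/lib64
prepend_path /usr/local/cuda-10.0/lib64 /opt/lib    # prints /usr/local/cuda-10.0/lib64:/opt/lib
```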
&lt;br /&gt;
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
&lt;br /&gt;
With version cuda 10.0, you don't need to edit rc.local to start the persistence daemon:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
 It was already installed and the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd 850&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
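&lt;br /&gt;
The 7.3T reported for md0 is what you would expect from RAID5, which keeps one drive's worth of capacity for parity. The arithmetic can be sketched as follows, assuming three drives of nominally 4 TB each (an inference from the 8TB array in the spec, not a figure taken from the drives' data sheets) and that lsblk reports binary units:&lt;br /&gt;
&lt;br /&gt;
```shell
# Usable RAID5 capacity in TiB: (drives - 1) * size, converted from
# decimal terabytes (10^12 bytes) to binary tebibytes (2^40 bytes).
# $1 = number of drives, $2 = nominal size of each drive in TB
raid5_usable_tib() {
    awk -v n="$1" -v tb="$2" 'BEGIN { printf "%.1f\n", (n - 1) * tb * 1e12 / 2^40 }'
}

raid5_usable_tib 3 4    # prints 7.3
```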
&lt;br /&gt;
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID refers to /dev/md0, whereas the UUID in the contents of mdadm.conf is the UUID of the 3 RAID5 drives together:&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array:&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both drives (md0 as backing, m.2 as cache). Note that when you do it in one command, the cache is attached to the backing device automatically.&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you make a mistake, cd to /sys/fs/bcache/whatever and then ls -l cache0. If there is an entry in there, echo 1 &amp;gt; stop. This unregisters the cache and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
&lt;br /&gt;
And update your boot image and give it a reboot to check the new bcache array comes back up ok:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
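&lt;br /&gt;
After the reboot, the state of the cache can be read back from sysfs. This is a sketch based on the attributes described in the kernel's bcache documentation (state and cache_mode under /sys/block/bcache0/bcache); the function name bcache_report and its optional directory argument are hypothetical:&lt;br /&gt;
&lt;br /&gt;
```shell
# Print the state and cache mode of a bcache backing device.
# $1 is an optional sysfs directory (defaults to /sys/block/bcache0/bcache).
bcache_report() {
    dir="${1:-/sys/block/bcache0/bcache}"
    if [ -r "$dir/state" ]; then
        echo "state: $(cat "$dir/state")"
        echo "cache_mode: $(cat "$dir/cache_mode")"
    else
        echo "no bcache device found at $dir"
    fi
}

bcache_report
```
&lt;br /&gt;
A state of &amp;quot;no cache&amp;quot; would suggest that the m.2 did not get attached to the backing device.&lt;br /&gt;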
&lt;br /&gt;
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check that samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask= 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
&lt;br /&gt;
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -p 1234 -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
&lt;br /&gt;
And add other users to the samba group (if not already members):&lt;br /&gt;
 usermod -G smbusers researcher #Note that without -a this replaces the user's supplementary groups and will overwrite sudo or other group assignments, so don't do it with your main account. Instead just append the group:&lt;br /&gt;
  usermod -aG smbusers ed&lt;br /&gt;
&lt;br /&gt;
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use cuda 10.0&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the latest official CUDA image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
Note: you can clean up stopped docker containers (along with unused networks and dangling images) with&lt;br /&gt;
 docker system prune&lt;br /&gt;
 &lt;br /&gt;
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an install directory in /bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
From https://www.anaconda.com/distribution/ the latest release ships python 3.7, so:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
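&lt;br /&gt;
The sha256sum output only helps if it is compared against the hash that Anaconda publishes for the installer. A minimal sketch of that comparison, where verify_sha256 is a hypothetical helper and the expected hash is something you copy from Anaconda's site rather than a value given here:&lt;br /&gt;
&lt;br /&gt;
```shell
# Compare a file's SHA-256 digest with an expected value.
# $1 = file to check, $2 = expected hex digest
verify_sha256() {
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    if [ "$actual" = "$2" ]; then
        echo "OK"
    else
        echo "MISMATCH"
    fi
}

# Usage (substitute the published hash for EXPECTED_HASH):
#   verify_sha256 Anaconda3-2019.03-Linux-x86_64.sh EXPECTED_HASH
```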
&lt;br /&gt;
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
&lt;br /&gt;
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install a VNC client on the machine you will connect from. We use the standalone exe from TigerVNC. &lt;br /&gt;
&lt;br /&gt;
Now install TightVNC, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As user &lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user (ailia)&lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
To make changes (or after the edit)&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
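&lt;br /&gt;
By convention, VNC display :N listens on TCP port 5900 + N, so display :2 is port 5902. Many viewers (TightVNC, TigerVNC and RealVNC among them) read host:N as a display number and host::P as a raw port, so the two forms are not interchangeable. The mapping can be sketched with a hypothetical helper:&lt;br /&gt;
&lt;br /&gt;
```shell
# Convert a VNC display number to the TCP port the server listens on.
# $1 = display number (the N in :N)
vnc_port() {
    echo $((5900 + $1))
}

vnc_port 1    # prints 5901
vnc_port 2    # prints 5902
```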
&lt;br /&gt;
Instructions on how to set up an IP tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in puTTY.exe and checked to see which local port was listening (it was 5901) and not firewalled using the listening ports tab under network on resmon.exe (it said allowed, not restricted under firewall status). VNC seemed to be running fine on Bastard, and I tried connecting to localhost::1 (that is 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The localhost port seems to be open and listening just fine: &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -p 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
&lt;br /&gt;
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.02 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  Which seems ok, despite having a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=2020.2.3&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48647</id>
		<title>DIGITS DevBox</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=DIGITS_DevBox&amp;diff=48647"/>
		<updated>2024-08-01T20:25:18Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Connection Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. NVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbo &amp;quot;Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built&amp;quot;] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].&lt;br /&gt;
&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
===Specification===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a XEON e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD hard drive (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, python 3.7, tensorflow 1.13, digits 6, and other useful machine learning tools/libraries.&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Documentation===&lt;br /&gt;
&lt;br /&gt;
The documentation from NVIDIA is here:&lt;br /&gt;
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html&lt;br /&gt;
*https://developer.nvidia.com/devbox&lt;br /&gt;
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf&lt;br /&gt;
&lt;br /&gt;
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. &lt;br /&gt;
The best instructions that I could find were:&lt;br /&gt;
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460&lt;br /&gt;
&lt;br /&gt;
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC], and at around $15k buying one is prohibitive for most people. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000], are selling variants on them, but their prices are high too and the details on their specs are limited (the MoBo and config details are missing entirely).&lt;br /&gt;
&lt;br /&gt;
But the parts cost is perhaps $4-5k now for the original spec! So this page goes through everything required to put one together and get it up and running.&lt;br /&gt;
&lt;br /&gt;
==Hardware==&lt;br /&gt;
&lt;br /&gt;
===Description===&lt;br /&gt;
&lt;br /&gt;
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB3.1), as well as some new drives, just for this project.&lt;br /&gt;
&lt;br /&gt;
[[File:Front1000.jpg|right|300px]] We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered), ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.&lt;br /&gt;
&lt;br /&gt;
===Parts List===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! Quantity !! Part&lt;br /&gt;
|-&lt;br /&gt;
| 1 || ASUS X99-E WS/USB 3.1 LGA 2011-v3 Intel X99 SATA 6Gb/s USB 3.1 USB 3.0 CEB Intel Motherboard&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Intel Haswell Xeon e5-2620v3, 6 core @ 2.4GHz, 6x256KB level 2 cache, 15MB level 3 cache, socket LGA 2011-v3&lt;br /&gt;
|-&lt;br /&gt;
| 8 || Crucial DDR4 RDIMM, 2133MHz, Registered (buffered) and ECC, 32GB&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN RTX DirectX 12 900-1G150-2500-000 SB 24GB 384-Bit GDDR6 HDCP Ready Video Card&lt;br /&gt;
|-&lt;br /&gt;
| 1 || NVIDIA TITAN Xp Graphics Card (900-1G611-2530-000)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || SAMSUNG 970 EVO PLUS 500GB Internal Solid State Drive (SSD) MZ-V7S500B/AM&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500/EU)&lt;br /&gt;
|-&lt;br /&gt;
| 3 || WD Red 4TB NAS Hard Disk Drive - 5400 RPM Class SATA 6Gb/s 64MB Cache 3.5 Inch - WD40EFRX&lt;br /&gt;
|-&lt;br /&gt;
| 1 || DVDRW: Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST&lt;br /&gt;
|-&lt;br /&gt;
| 1 || EVGA SuperNOVA 1600 T2 220-T2-1600-X1 80+ TITANIUM 1600W Fully Modular EVGA ECO Mode Power Supply&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-L4000 - 4U Rackmount Server Case / Chassis - 8 Internal Bays, 7 Cooling Fans Included&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RSV-SATA-Cage-34 - Hard Disk Drives - Black, 3 x 5.25&amp;quot; to 4 x 3.5&amp;quot; Hot-Swap - SATA III / SAS - Cage&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Rosewill RDRD-11003 2.5&amp;quot; SSD / HDD Mounting Kit for 3.5&amp;quot; Drive Bay w/ 60mm Fan&lt;br /&gt;
|-&lt;br /&gt;
| 3 || Corsair ML120 PRO LED CO-9050043-WW 120mm Blue LED 120mm Premium Magnetic Levitation PWM Fan&lt;br /&gt;
|-&lt;br /&gt;
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Build notes===&lt;br /&gt;
&lt;br /&gt;
Old notes on a prior look at a [[GPU Build]] are on the wiki too.&lt;br /&gt;
&lt;br /&gt;
[[File:Back1000.jpg|right|300px]] There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.&lt;br /&gt;
&lt;br /&gt;
===BIOS===&lt;br /&gt;
&lt;br /&gt;
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.&lt;br /&gt;
&lt;br /&gt;
We did NOT update the BIOS. It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.&lt;br /&gt;
&lt;br /&gt;
We then made the following changes:&lt;br /&gt;
*Set the three hard disks to hot-swap enable&lt;br /&gt;
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)&lt;br /&gt;
*List the OS as &amp;quot;Other OS&amp;quot; rather than windows, and set enhanced mode to disabled&lt;br /&gt;
*Delete the PK to disable secure boot&lt;br /&gt;
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
*We will build the RAID 5 array in software, rather than using the X99 chipset through the BIOS&lt;br /&gt;
&lt;br /&gt;
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.&lt;br /&gt;
&lt;br /&gt;
==Software==&lt;br /&gt;
&lt;br /&gt;
===Main OS Install===&lt;br /&gt;
&lt;br /&gt;
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.&lt;br /&gt;
&lt;br /&gt;
====In the installer====&lt;br /&gt;
&lt;br /&gt;
Choose the first network hardware option and make sure that the second (right most) network port is connected to a DHCP broadcasting router.&lt;br /&gt;
&lt;br /&gt;
Under partitions: &lt;br /&gt;
[[File:Partitions1000.jpg|right|300px]] &lt;br /&gt;
# Put one large partition, formatted as ext4, mounted as /, bootable on the 850&lt;br /&gt;
# Partition each SATA drive as RAID&lt;br /&gt;
# Put one large partition, formatted as ext4, not mounted on the 970 (for later)&lt;br /&gt;
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk&lt;br /&gt;
&lt;br /&gt;
Install SSH and Samba. When prompted, add the MBR to the front of the 850.&lt;br /&gt;
&lt;br /&gt;
====First boot====&lt;br /&gt;
&lt;br /&gt;
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.&lt;br /&gt;
&lt;br /&gt;
Run as root:&lt;br /&gt;
 apt-get update&lt;br /&gt;
 apt-get dist-upgrade&lt;br /&gt;
 apt-get install --install-recommends linux-generic-hwe-18.04 &lt;br /&gt;
&lt;br /&gt;
Check the release:&lt;br /&gt;
 lsb_release -a&lt;br /&gt;
&lt;br /&gt;
Give the box a reboot!&lt;br /&gt;
&lt;br /&gt;
===X Windows===&lt;br /&gt;
&lt;br /&gt;
If you install the video driver before installing Xwindows, you will need to manually edit the Xwindows config files. So, now install the X window system. The easiest way is:&lt;br /&gt;
 tasksel&lt;br /&gt;
  And choose your favorite. We used Ubuntu Desktop.&lt;br /&gt;
&lt;br /&gt;
And reboot again to make sure that everything is working nicely.&lt;br /&gt;
&lt;br /&gt;
===Video Drivers===&lt;br /&gt;
&lt;br /&gt;
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.&lt;br /&gt;
&lt;br /&gt;
===Hardware and Drivers===&lt;br /&gt;
&lt;br /&gt;
Check the hardware is being seen and what driver is being used with:&lt;br /&gt;
  lspci -vk&lt;br /&gt;
&lt;br /&gt;
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.&lt;br /&gt;
&lt;br /&gt;
You can also list the driver using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:&lt;br /&gt;
 apt-get install ubuntu-drivers-common&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free recommended&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
But the 390 is the only driver available from the main repo. Add the experimental repo for more options:&lt;br /&gt;
&lt;br /&gt;
 add-apt-repository ppa:graphics-drivers/ppa&lt;br /&gt;
 apt update&lt;br /&gt;
 ubuntu-drivers devices&lt;br /&gt;
  == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==&lt;br /&gt;
  modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00&lt;br /&gt;
  vendor   : NVIDIA Corporation&lt;br /&gt;
  model    : GP102 [TITAN Xp]&lt;br /&gt;
  driver   : nvidia-driver-418 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-415 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-430 - third-party free recommended&lt;br /&gt;
  driver   : nvidia-driver-396 - third-party free&lt;br /&gt;
  driver   : nvidia-driver-390 - distro non-free&lt;br /&gt;
  driver   : nvidia-driver-410 - third-party free&lt;br /&gt;
  driver   : xserver-xorg-video-nouveau - distro free builtin&lt;br /&gt;
&lt;br /&gt;
Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded. &lt;br /&gt;
&lt;br /&gt;
 apt-get install build-essential&lt;br /&gt;
 gcc --version&lt;br /&gt;
 vi /etc/modprobe.d/blacklist-nouveau.conf&lt;br /&gt;
  blacklist nouveau&lt;br /&gt;
  options nouveau modeset=0&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
  Reboot to a text terminal&lt;br /&gt;
 lspci -vk&lt;br /&gt;
  Shows no kernel driver in use!&lt;br /&gt;
&lt;br /&gt;
Install the driver!&lt;br /&gt;
&lt;br /&gt;
 apt install nvidia-driver-430&lt;br /&gt;
&lt;br /&gt;
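Once the driver is installed (and after a reboot, if the new kernel module has not been loaded yet), it is worth checking that both GPUs are visible. The following is a minimal sketch rather than part of the original build: check_nvidia_driver is a hypothetical helper name, and it assumes that nvidia-smi was installed along with the driver package.&lt;br /&gt;
&lt;br /&gt;
```shell
# check_nvidia_driver is a hypothetical helper, not an NVIDIA tool.
# It reports the index, name and driver version of each GPU that nvidia-smi can see.
check_nvidia_driver() {
    if [ -z "$(command -v nvidia-smi)" ]; then
        echo "nvidia-smi not found: the driver does not appear to be installed"
        return 1
    fi
    nvidia-smi --query-gpu=index,name,driver_version --format=csv
}

check_nvidia_driver || echo "driver check failed"
```
&lt;br /&gt;
On this box we would expect one row for the TITAN Xp and one for the TITAN RTX.&lt;br /&gt;
&lt;br /&gt;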
====CUDA====&lt;br /&gt;
&lt;br /&gt;
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.&lt;br /&gt;
&lt;br /&gt;
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html&lt;br /&gt;
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&amp;amp;target_arch=x86_64&amp;amp;target_distro=Ubuntu&amp;amp;target_version=1804&amp;amp;target_type=runfilelocal&lt;br /&gt;
Essentially, first install build-essential, which gets you gcc. &lt;br /&gt;
&lt;br /&gt;
Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):&lt;br /&gt;
 sh cuda_10.0.130_410.48_linux.run&lt;br /&gt;
&lt;br /&gt;
 	Do you accept the previously read EULA?&lt;br /&gt;
 	accept/decline/quit: accept&lt;br /&gt;
 &lt;br /&gt;
 	Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: n&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Toolkit?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter Toolkit Location&lt;br /&gt;
 	 [ default is /usr/local/cuda-10.0 ]:&lt;br /&gt;
 &lt;br /&gt;
 	Do you want to install a symbolic link at /usr/local/cuda?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Install the CUDA 10.0 Samples?&lt;br /&gt;
 	(y)es/(n)o/(q)uit: y&lt;br /&gt;
 &lt;br /&gt;
 	Enter CUDA Samples Location&lt;br /&gt;
 	 [ default is /home/ed ]:&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...&lt;br /&gt;
 	Missing recommended library: libGLU.so&lt;br /&gt;
 	Missing recommended library: libX11.so&lt;br /&gt;
 	Missing recommended library: libXi.so&lt;br /&gt;
 	Missing recommended library: libXmu.so&lt;br /&gt;
 	Missing recommended library: libGL.so&lt;br /&gt;
 &lt;br /&gt;
 	Installing the CUDA Samples in /home/ed ...&lt;br /&gt;
 	Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...&lt;br /&gt;
 	Finished copying samples.&lt;br /&gt;
 &lt;br /&gt;
 	===========&lt;br /&gt;
 	= Summary =&lt;br /&gt;
 	===========&lt;br /&gt;
 &lt;br /&gt;
 	Driver:   Not Selected&lt;br /&gt;
 	Toolkit:  Installed in /usr/local/cuda-10.0&lt;br /&gt;
 	Samples:  Installed in /home/ed, but missing recommended libraries&lt;br /&gt;
 &lt;br /&gt;
 	Please make sure that&lt;br /&gt;
 	 -   PATH includes /usr/local/cuda-10.0/bin&lt;br /&gt;
 	 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root&lt;br /&gt;
 &lt;br /&gt;
 	To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin&lt;br /&gt;
 &lt;br /&gt;
 	Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.&lt;br /&gt;
 &lt;br /&gt;
 	***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required &lt;br /&gt;
 for CUDA 10.0 functionality to work.&lt;br /&gt;
 	To install the driver using this installer, run the following command, replacing &amp;lt;CudaInstaller&amp;gt; with the name of this run file:&lt;br /&gt;
 	    sudo &amp;lt;CudaInstaller&amp;gt;.run -silent -driver&lt;br /&gt;
 &lt;br /&gt;
 	Logfile is /tmp/cuda_install_2807.log&lt;br /&gt;
&lt;br /&gt;
Now fix the paths. To do this for a single user do:&lt;br /&gt;
 export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}&lt;br /&gt;
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;br /&gt;
&lt;br /&gt;
But it is better to fix it for everyone by editing your environment file:&lt;br /&gt;
 vi /etc/environment&lt;br /&gt;
  PATH=&amp;quot;/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games&amp;quot;&lt;br /&gt;
  LD_LIBRARY_PATH=&amp;quot;/usr/local/cuda-10.0/lib64&amp;quot;&lt;br /&gt;
&lt;br /&gt;
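The ${PATH:+:${PATH}} expansion in the export lines only adds the colon separator when the variable is already set. If you put the exports in a shell start-up file (e.g., ~/.bashrc), something like the following sketch avoids adding the same directory twice each time the file is sourced. Note that prepend_dir is a hypothetical helper name, and that this does not apply to /etc/environment, which takes plain KEY=value lines and does not expand variables.&lt;br /&gt;
&lt;br /&gt;
```shell
# prepend_dir is a hypothetical helper, not part of CUDA.
# It prints the colon-separated list $2 with directory $1 prepended,
# unless $1 is already in the list.
prepend_dir() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        *) printf '%s\n' "$dir${list:+:$list}" ;;
    esac
}

PATH=$(prepend_dir /usr/local/cuda-10.0/bin "$PATH")
LD_LIBRARY_PATH=$(prepend_dir /usr/local/cuda-10.0/lib64 "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```
&lt;br /&gt;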
With CUDA 10.0, you don't need to add the following line to rc.local to start the persistence daemon:&lt;br /&gt;
 /usr/bin/nvidia-persistenced --verbose&lt;br /&gt;
&lt;br /&gt;
Instead, nvidia-persistenced runs as a service. &lt;br /&gt;
&lt;br /&gt;
====Test the installation====&lt;br /&gt;
&lt;br /&gt;
Make the samples...&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples&lt;br /&gt;
 make&lt;br /&gt;
 &lt;br /&gt;
And change into the sample directory and run the tests:&lt;br /&gt;
&lt;br /&gt;
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release&lt;br /&gt;
 ./deviceQuery&lt;br /&gt;
 ./bandwidthTest &lt;br /&gt;
&lt;br /&gt;
Everything should be good at this point!&lt;br /&gt;
&lt;br /&gt;
===Bcache===&lt;br /&gt;
&lt;br /&gt;
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive. Begin by installing bcache:&lt;br /&gt;
 apt-get install bcache-tools&lt;br /&gt;
 It was already installed and the newest version&lt;br /&gt;
&lt;br /&gt;
See what we have:&lt;br /&gt;
 fdisk -l&lt;br /&gt;
&lt;br /&gt;
This gives us:&lt;br /&gt;
*/dev/nvme0n1p1  m.2&lt;br /&gt;
*/dev/sda RAID disk&lt;br /&gt;
*/dev/sdb RAID disk&lt;br /&gt;
*/dev/sdc RAID disk&lt;br /&gt;
*/dev/md0 RAID array&lt;br /&gt;
*/dev/sdd 850&lt;br /&gt;
&lt;br /&gt;
The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):&lt;br /&gt;
 lsblk&lt;br /&gt;
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT&lt;br /&gt;
 sda           8:0    0   3.7T  0 disk&lt;br /&gt;
 └─sda1        8:1    0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdb           8:16   0   3.7T  0 disk&lt;br /&gt;
 └─sdb1        8:17   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdc           8:32   0   3.7T  0 disk&lt;br /&gt;
 └─sdc1        8:33   0   3.7T  0 part&lt;br /&gt;
   └─md0       9:0    0   7.3T  0 raid5 /bulk&lt;br /&gt;
 sdd           8:48   0 465.8G  0 disk&lt;br /&gt;
 └─sdd1        8:49   0 465.8G  0 part  /&lt;br /&gt;
 sr0          11:0    1  1024M  0 rom&lt;br /&gt;
 nvme0n1     259:0    0 465.8G  0 disk&lt;br /&gt;
 └─nvme0n1p1 259:1    0 465.8G  0 part&lt;br /&gt;
&lt;br /&gt;
Check the mdadm.conf file and fstab:&lt;br /&gt;
 cat /etc/mdadm/mdadm.conf&lt;br /&gt;
  ...&lt;br /&gt;
  ARRAY /dev/md/0  metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0&lt;br /&gt;
 &lt;br /&gt;
 cat /etc/fstab&lt;br /&gt;
  UUID=475ad41e-3d64-4c90-8fbc-9289c050acea /               ext4    errors=remount-ro 0 1&lt;br /&gt;
  UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk           ext4    defaults 0 2&lt;br /&gt;
  /swapfile                                 none            swap    sw 0 0&lt;br /&gt;
&lt;br /&gt;
Note that the second UUID refers to /dev/md0, whereas the UUID in the contents of mdadm.conf is the UUID of the 3 RAID5 drives together:&lt;br /&gt;
 blkid /dev/md0&lt;br /&gt;
 /dev/md0: UUID=&amp;quot;aa65554a-24d9-450a-b10c-63c5c6a4b48a&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note we have an active RAID5 array:&lt;br /&gt;
 cat /proc/mdstat&lt;br /&gt;
&lt;br /&gt;
Instructions for taking apart and/or (re-)creating a RAID array are here:&lt;br /&gt;
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
Instructions on building a bcache are here:&lt;br /&gt;
*https://wiki.ubuntu.com/ServerTeam/Bcache&lt;br /&gt;
*https://www.kernel.org/doc/Documentation/bcache.txt&lt;br /&gt;
&lt;br /&gt;
Unmount the RAID array:&lt;br /&gt;
 umount /dev/md0&lt;br /&gt;
&lt;br /&gt;
Wipe both the m.2 and the RAID5 array:&lt;br /&gt;
 wipefs -a /dev/nvme0n1p1&lt;br /&gt;
 wipefs -a /dev/md0&lt;br /&gt;
&lt;br /&gt;
Make the bcache, formatting both drives (md0 as backing, m.2 as cache). Note that when you do it in one command, the cache is attached to the backing device automatically.&lt;br /&gt;
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
If you screw up, cd to /sys/fs/bcache/whatever (the directory is named after the cache set's UUID) and then ls -l cache0. If there is an entry in there, echo 1 &amp;gt; stop. This unregisters the cache set and should let you start over.&lt;br /&gt;
&lt;br /&gt;
Check the new bcache array is there, format it and mount it:&lt;br /&gt;
 ls /dev/bcache*&lt;br /&gt;
 mkfs.ext4 /dev/bcache0&lt;br /&gt;
 mount /dev/bcache0 /bulk&lt;br /&gt;
&lt;br /&gt;
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:&lt;br /&gt;
 blkid /dev/bcache0&lt;br /&gt;
   UUID=&amp;quot;4c63f20b-ad35-477d-bfaa-82571beba841&amp;quot; TYPE=&amp;quot;ext4&amp;quot;&lt;br /&gt;
 cp /etc/fstab /etc/fstab.org&lt;br /&gt;
 vi /etc/fstab&lt;br /&gt;
  Comment out old RAID array entry&lt;br /&gt;
  Add new entry:&lt;br /&gt;
   UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0&lt;br /&gt;
&lt;br /&gt;
And update your boot image and give it a reboot to check the new bcache array comes back up ok:&lt;br /&gt;
 update-initramfs -u&lt;br /&gt;
 shutdown -r now&lt;br /&gt;
&lt;br /&gt;
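After the reboot, the state of the cache can be read from sysfs. This is a minimal sketch, assuming the device came back up as bcache0: bcache_report is a hypothetical helper name, and the state, cache_mode and dirty_data files are the ones described in the kernel's bcache documentation.&lt;br /&gt;
&lt;br /&gt;
```shell
# bcache_report is a hypothetical helper, not part of bcache-tools.
# It prints a few sysfs attributes for a bcache device (default: bcache0).
bcache_report() {
    dev=${1:-bcache0}
    base=/sys/block/$dev/bcache
    if [ ! -d "$base" ]; then
        echo "$dev: no bcache sysfs directory found"
        return 1
    fi
    for f in state cache_mode dirty_data; do
        printf '%s: %s\n' "$f" "$(cat "$base/$f")"
    done
}

bcache_report bcache0 || true
```
&lt;br /&gt;
A state of clean or dirty means that a cache is attached, whereas no cache means that the backing device is running without one.&lt;br /&gt;
&lt;br /&gt;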
===Samba===&lt;br /&gt;
&lt;br /&gt;
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux&lt;br /&gt;
&lt;br /&gt;
Check samba is installed:&lt;br /&gt;
 samba --version&lt;br /&gt;
&lt;br /&gt;
Then fix the conf file:&lt;br /&gt;
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;br /&gt;
 vi /etc/samba/smb.conf&lt;br /&gt;
 	workgroup=BASTARDGROUP&lt;br /&gt;
  	usershare allow guests = no&lt;br /&gt;
 	;comment out the [printers] and [print$] sections&lt;br /&gt;
     &lt;br /&gt;
 	[bulk]&lt;br /&gt;
 	comment = Bulk RAID Array&lt;br /&gt;
 	path = /bulk&lt;br /&gt;
 	browseable = yes&lt;br /&gt;
 	create mask= 0775&lt;br /&gt;
 	directory mask = 0775&lt;br /&gt;
 	read only = no&lt;br /&gt;
 	guest ok = no&lt;br /&gt;
&lt;br /&gt;
Test the parameters, change the permissions and ownership:&lt;br /&gt;
 testparm /etc/samba/smb.conf&lt;br /&gt;
 chmod 770 /bulk&lt;br /&gt;
 groupadd smbusers&lt;br /&gt;
 chown :smbusers /bulk&lt;br /&gt;
&lt;br /&gt;
Now create the researcher account, and add it to the samba share group&lt;br /&gt;
 cat /etc/group&lt;br /&gt;
 groupadd -g 1002 researcher&lt;br /&gt;
 useradd -g researcher -G smbusers -s /bin/bash -p 1234 -d /home/researcher -m researcher&lt;br /&gt;
 passwd researcher&lt;br /&gt;
 	hint: littleamount&lt;br /&gt;
 smbpasswd -a researcher&lt;br /&gt;
&lt;br /&gt;
Finally restart samba:&lt;br /&gt;
 systemctl restart smbd&lt;br /&gt;
 systemctl restart nmbd&lt;br /&gt;
&lt;br /&gt;
Check it works:&lt;br /&gt;
 smbclient -L localhost&lt;br /&gt;
 (no root password)&lt;br /&gt;
&lt;br /&gt;
And add users to the samba group (if not already):&lt;br /&gt;
 usermod -G smbusers researcher #Note that -G without -a replaces the user's supplementary groups, so it will overwrite sudo or other group assignments. Don't do it with your main account. Instead, append the group:&lt;br /&gt;
  usermod -a -G smbusers ed&lt;br /&gt;
&lt;br /&gt;
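To confirm that an account has actually picked up the group, something like the following sketch can be used. Note that in_group is a hypothetical helper name; it just wraps id -nG, which lists all of a user's groups.&lt;br /&gt;
&lt;br /&gt;
```shell
# in_group is a hypothetical helper: it succeeds if user $1 is a member of group $2.
in_group() {
    id -nG "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

if in_group researcher smbusers; then
    echo "researcher is in smbusers"
else
    echo "researcher is NOT in smbusers"
fi
```
&lt;br /&gt;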
===Dev Tools===&lt;br /&gt;
&lt;br /&gt;
====DIGITS====&lt;br /&gt;
&lt;br /&gt;
This section follows https://developer.nvidia.com/rdp/digits-download. Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/&lt;br /&gt;
&lt;br /&gt;
Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use CUDA 10.0:&lt;br /&gt;
 ...&lt;br /&gt;
 sudo apt-get install -y nvidia-docker2&lt;br /&gt;
 sudo pkill -SIGHUP dockerd&lt;br /&gt;
 # Test nvidia-smi with the latest official CUDA image&lt;br /&gt;
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi&lt;br /&gt;
&lt;br /&gt;
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):&lt;br /&gt;
 docker pull nvidia/digits&lt;br /&gt;
&lt;br /&gt;
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):&lt;br /&gt;
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits&lt;br /&gt;
&lt;br /&gt;
And open a browser to http://localhost:5000/ to see DIGITS.&lt;br /&gt;
&lt;br /&gt;
Documentation:&lt;br /&gt;
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md&lt;br /&gt;
*https://developer.nvidia.com/digits&lt;br /&gt;
&lt;br /&gt;
Note: you can clean up stopped docker containers (along with unused networks and dangling images) with&lt;br /&gt;
 docker system prune&lt;br /&gt;
 &lt;br /&gt;
====cuDNN====&lt;br /&gt;
&lt;br /&gt;
Documentation on installing cuDNN is here:&lt;br /&gt;
*https://developer.nvidia.com/cuDNN&lt;br /&gt;
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html&lt;br /&gt;
&lt;br /&gt;
First, make an install directory in /bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox). Then:&lt;br /&gt;
 cd /bulk/install/&lt;br /&gt;
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb&lt;br /&gt;
&lt;br /&gt;
And test it:&lt;br /&gt;
 cp -r /usr/src/cudnn_samples_v7/ $HOME&lt;br /&gt;
 cd  $HOME/cudnn_samples_v7/mnistCUDNN&lt;br /&gt;
 make clean &amp;amp;&amp;amp; make&lt;br /&gt;
 ./mnistCUDNN&lt;br /&gt;
  Test passed!&lt;br /&gt;
&lt;br /&gt;
====Python Based====&lt;br /&gt;
&lt;br /&gt;
Now install Anaconda, so that we have python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script.&lt;br /&gt;
&lt;br /&gt;
From https://www.anaconda.com/distribution/ the latest version is 3.7, so:&lt;br /&gt;
 cd /bulk/install&lt;br /&gt;
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
&lt;br /&gt;
As user researcher, run the installation (this installs python 3.7.3):&lt;br /&gt;
 bash Anaconda3-2019.03-Linux-x86_64.sh&lt;br /&gt;
  accept the install location: /home/researcher/anaconda3&lt;br /&gt;
  accept the initialization by running conda init&lt;br /&gt;
 Flush the local env:&lt;br /&gt;
  source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=====Tensorflow=====&lt;br /&gt;
&lt;br /&gt;
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip):&lt;br /&gt;
 As root:&lt;br /&gt;
  apt install python3-pip&lt;br /&gt;
  apt install virtualenv&lt;br /&gt;
  pip3 install -U virtualenv&lt;br /&gt;
&lt;br /&gt;
 As researcher:&lt;br /&gt;
  cd /home/researcher&lt;br /&gt;
  virtualenv --system-site-packages -p python3 ./venv&lt;br /&gt;
  source ./venv/bin/activate  # sh, bash, ksh, or zsh&lt;br /&gt;
  pip install --upgrade tensorflow-gpu&lt;br /&gt;
  python -c &amp;quot;import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Note: to deactivate the virtual environment:&lt;br /&gt;
 deactivate&lt;br /&gt;
&lt;br /&gt;
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.&lt;br /&gt;
&lt;br /&gt;
=====PyTorch and SciKit=====&lt;br /&gt;
&lt;br /&gt;
Run the following as researcher (in venv):&lt;br /&gt;
 conda install -c anaconda numpy&lt;br /&gt;
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch&lt;br /&gt;
 conda install -c anaconda scikit-learn&lt;br /&gt;
&lt;br /&gt;
Refs:&lt;br /&gt;
*https://anaconda.org/anaconda/scikit-learn&lt;br /&gt;
*https://anaconda.org/anaconda/numpy&lt;br /&gt;
*https://pytorch.org/&lt;br /&gt;
&lt;br /&gt;
====Other packages====&lt;br /&gt;
&lt;br /&gt;
The following are not yet installed:&lt;br /&gt;
*Caffe: http://caffe.berkeleyvision.org/&lt;br /&gt;
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running&lt;br /&gt;
&lt;br /&gt;
=====Theano=====&lt;br /&gt;
&lt;br /&gt;
Theano v.1 requires python &amp;gt;=3.4 and &amp;lt;3.6. We are currently running 3.7. If we decide to install theano, we'll need to set up another version of python and another virtual environment. See:&lt;br /&gt;
*http://deeplearning.net/software/theano/install_ubuntu.html&lt;br /&gt;
&lt;br /&gt;
===VNC===&lt;br /&gt;
&lt;br /&gt;
In order to use the graphical interface for Matlab and other applications, we need a VNC server. &lt;br /&gt;
&lt;br /&gt;
First, install the VNC client remotely. We use the standalone exe from TigerVNC. &lt;br /&gt;
&lt;br /&gt;
Now install TightVNC, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04&lt;br /&gt;
&lt;br /&gt;
 cd /root&lt;br /&gt;
 apt-get install xfce4 xfce4-goodies&lt;br /&gt;
&lt;br /&gt;
As user &lt;br /&gt;
 sudo apt-get install tightvncserver&lt;br /&gt;
 vncserver&lt;br /&gt;
  set password for user &lt;br /&gt;
 vncserver -kill :1&lt;br /&gt;
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak&lt;br /&gt;
 vi ~/.vnc/xstartup&lt;br /&gt;
  #!/bin/bash&lt;br /&gt;
  xrdb $HOME/.Xresources&lt;br /&gt;
  startxfce4 &amp;amp;&lt;br /&gt;
 vncserver&lt;br /&gt;
 sudo vi /etc/systemd/system/vncserver@.service&lt;br /&gt;
  [Unit]&lt;br /&gt;
  Description=Start TightVNC server at startup&lt;br /&gt;
  After=syslog.target network.target  &lt;br /&gt;
  &lt;br /&gt;
  [Service]&lt;br /&gt;
  Type=forking&lt;br /&gt;
  User=uname&lt;br /&gt;
  Group=uname&lt;br /&gt;
  WorkingDirectory=/home/uname&lt;br /&gt;
  &lt;br /&gt;
  PIDFile=/home/uname/.vnc/%H:%i.pid&lt;br /&gt;
  ExecStartPre=-/usr/bin/vncserver -kill :%i &amp;gt; /dev/null 2&amp;gt;&amp;amp;1&lt;br /&gt;
  ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i&lt;br /&gt;
  ExecStop=/usr/bin/vncserver -kill :%i&lt;br /&gt;
  &lt;br /&gt;
  [Install]&lt;br /&gt;
  WantedBy=multi-user.target&lt;br /&gt;
&lt;br /&gt;
Note that changing the color depth breaks it!&lt;br /&gt;
&lt;br /&gt;
To make changes (or after the edit)&lt;br /&gt;
 sudo systemctl daemon-reload&lt;br /&gt;
 sudo systemctl enable vncserver@2.service&lt;br /&gt;
 vncserver -kill :2&lt;br /&gt;
 sudo systemctl start vncserver@2&lt;br /&gt;
 sudo systemctl status vncserver@2&lt;br /&gt;
&lt;br /&gt;
Stop the server with&lt;br /&gt;
 sudo systemctl stop vncserver@2&lt;br /&gt;
&lt;br /&gt;
Note that we are using :2 because :1 is running our regular Xwindows GUI.&lt;br /&gt;
&lt;br /&gt;
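A VNC display :N listens on TCP port 5900+N, so display :2 is on port 5902. The following sketch makes the port arithmetic explicit and prints (but does not run) an OpenSSH command that forwards a local display to a remote one. Note that vnc_port and tunnel_cmd are hypothetical helper names, and that the ed account name is an assumption for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
# vnc_port is a hypothetical helper: VNC display :N listens on TCP port 5900+N.
vnc_port() {
    echo $((5900 + $1))
}

# tunnel_cmd is a hypothetical helper: it prints an OpenSSH port-forwarding command.
# Arguments: local display number, remote display number, user@host.
tunnel_cmd() {
    echo "ssh -N -L $(vnc_port "$1"):localhost:$(vnc_port "$2") $3"
}

vnc_port 2
tunnel_cmd 1 2 ed@192.168.2.202
```
&lt;br /&gt;
With such a tunnel in place, a client connecting to display :1 on the local machine ends up on display :2 on the box.&lt;br /&gt;
&lt;br /&gt;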
Instructions on how to set up an IP tunnel using PuTTY:&lt;br /&gt;
 https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/&lt;br /&gt;
&lt;br /&gt;
====Connection Issues====&lt;br /&gt;
&lt;br /&gt;
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in puTTY.exe and checked to see which local port was listening (it was 5901) and not firewalled using the listening ports tab under network on resmon.exe (it said allowed, not restricted under firewall status). VNC seemed to be running fine on Bastard, and I tried connecting to localhost::1 (that is 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.&lt;br /&gt;
&lt;br /&gt;
I checked it was listening and there was no firewall:&lt;br /&gt;
 netstat -tlpn&lt;br /&gt;
  tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      2025/Xtightvnc&lt;br /&gt;
 ufw status&lt;br /&gt;
  Status: inactive&lt;br /&gt;
&lt;br /&gt;
The localhost port seems to be open and listening just fine: &lt;br /&gt;
 Test-NetConnection 127.0.0.1 -Port 5901&lt;br /&gt;
&lt;br /&gt;
So, presumably, there must be something wrong with the tunnel itself.&lt;br /&gt;
&lt;br /&gt;
'''Ignoring the SSH tunnel worked fine: Connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''&lt;br /&gt;
&lt;br /&gt;
Exit full screen with ctrl-alt-shift-f.&lt;br /&gt;
&lt;br /&gt;
===RDP===&lt;br /&gt;
&lt;br /&gt;
I also installed xrdp:&lt;br /&gt;
 apt install xrdp&lt;br /&gt;
 adduser xrdp ssl-cert&lt;br /&gt;
 #Check the status and that it is listening on 3389&lt;br /&gt;
 systemctl status xrdp&lt;br /&gt;
 netstat -tln&lt;br /&gt;
  #It is listening... &lt;br /&gt;
 vi /etc/xrdp/xrdp.ini&lt;br /&gt;
  #See https://linux.die.net/man/5/xrdp.ini&lt;br /&gt;
 systemctl restart xrdp&lt;br /&gt;
&lt;br /&gt;
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said &amp;quot;login successful for display 10, start connecting, connection problems, giving up, some problem.&amp;quot;&lt;br /&gt;
  cat /var/log/xrdp-sesman.log&lt;br /&gt;
&lt;br /&gt;
There could be some conflict between VNC and RDP. systemctl status xrdp shows &amp;quot;xrdp_wm_log_msg: connection problem, giving up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
I tried without success:&lt;br /&gt;
 gsettings set org.gnome.Vino require-encryption false&lt;br /&gt;
  https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp&lt;br /&gt;
 vi /etc/X11/Xwrapper.config&lt;br /&gt;
  allowed_users = anybody&lt;br /&gt;
  This was promising as it was previously set to console.&lt;br /&gt;
  https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508&lt;br /&gt;
 apt-get install xorgxrdp-hwe-18.04&lt;br /&gt;
  Couldn't find the package... This lead was promising as it applies to 18.04.2 HWE, which is what I'm running&lt;br /&gt;
  https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/&lt;br /&gt;
 dpkg -l |grep xserver-xorg-core&lt;br /&gt;
  ii  xserver-xorg-core                          2:1.19.6-1ubuntu4.3                          amd64        Xorg X server - core server&lt;br /&gt;
  This seems OK, even though a problem with XRDP on Ubuntu 18.04 HWE is documented very clearly here: http://c-nergy.be/blog/?p=13972&lt;br /&gt;
&lt;br /&gt;
There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!&lt;br /&gt;
 apt remove xrdp&lt;br /&gt;
&lt;br /&gt;
===Other Software===&lt;br /&gt;
&lt;br /&gt;
I installed the community edition of PyCharm:&lt;br /&gt;
 snap install pycharm-community --classic&lt;br /&gt;
  #Restart the local terminal so that it has updated paths (after a snap install, etc.)&lt;br /&gt;
 /snap/pycharm-community/214/bin/pycharm.sh&lt;br /&gt;
&lt;br /&gt;
On launch, you get some config options. I chose to install and enable:&lt;br /&gt;
*IdeaVim (a VI editor emulator)&lt;br /&gt;
*R&lt;br /&gt;
*AWS Toolkit&lt;br /&gt;
&lt;br /&gt;
Make a launcher: In /usr/share/applications: &lt;br /&gt;
 vi pycharm.desktop&lt;br /&gt;
  [Desktop Entry]&lt;br /&gt;
  Version=1.0&lt;br /&gt;
  Type=Application&lt;br /&gt;
  Name=PyCharm&lt;br /&gt;
  Icon=/snap/pycharm-community/214/bin/pycharm.png&lt;br /&gt;
  Exec=&amp;quot;/snap/pycharm-community/214/bin/pycharm.sh&amp;quot; %f&lt;br /&gt;
  Comment=The Drive to Develop&lt;br /&gt;
  Categories=Development;IDE;&lt;br /&gt;
  Terminal=false&lt;br /&gt;
  StartupWMClass=jetbrains-pycharm&lt;br /&gt;
&lt;br /&gt;
Also, create a launcher on the desktop with the same info.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=Super_Secret_Passwords&amp;diff=48646</id>
		<title>Super Secret Passwords</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=Super_Secret_Passwords&amp;diff=48646"/>
		<updated>2024-07-15T20:59:42Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{#set: Visible to::whitelist|Visible to group::team}}&lt;br /&gt;
&lt;br /&gt;
webserver user: alexjiang&lt;br /&gt;
&lt;br /&gt;
password: Another1Key2Success&lt;br /&gt;
&lt;br /&gt;
MySQL root password: doesnthaveaphd&lt;br /&gt;
&lt;br /&gt;
How I de/reconfigure the network interface:&lt;br /&gt;
&lt;br /&gt;
 $ sudo ifdown p3p1&lt;br /&gt;
 $ sudo ifup p3p1&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48644</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48644"/>
		<updated>2024-06-13T19:37:03Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]] or the [[McNair Center]] build, which was called [[VentureXpert Data]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last regex. It deals with the spaces in the description and the cases when there is no description.&lt;br /&gt;
#Try importing USVCPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;br /&gt;
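&lt;br /&gt;
For reference, the first section of the regex process (Process.txt through Out5.txt) can be sketched in Python. This is a rough equivalent under stated assumptions, not the script used for the build: join_wrapped_records is a hypothetical name, the sample layout in the comments is made up, and empty records are dropped rather than kept as blank lines.&lt;br /&gt;
&lt;br /&gt;
```python
import re


def join_wrapped_records(text):
    """Join indented continuation lines onto the record line that precedes them."""
    parts = []
    for line in text.splitlines(keepends=True):
        # A new record starts in column 0; continuation lines start with a space.
        if not line.startswith(" "):
            line = "###" + line
        # Collapse the wide padding (65 or more whitespace characters) to one space.
        line = re.sub(r"\s{65,}", " ", line)
        # Drop the line ending so that each record ends up on a single line.
        parts.append(line.replace("\n", ""))
    # Split back into records on the marker (assumption: empty records are discarded).
    records = [r for r in "".join(parts).split("###") if r]
    # A record ending in a 4-digit year plus a space has no description: end it with a tab.
    return [re.sub(r"(\d{4}) $", r"\1\t", r) for r in records]


if __name__ == "__main__":
    # Hypothetical input: one record wrapped over two lines and one without a description.
    sample = (
        "Acme 2001" + " " * 70 + "Makes widgets\n"
        + " " * 70 + "and gadgets\n"
        + "Beta 2002 \n"
    )
    for record in join_wrapped_records(sample):
        print(repr(record))
```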
&lt;br /&gt;
===Geocoding===&lt;br /&gt;
&lt;br /&gt;
Part of Load.sql requires we update the Geocoding - adding new long and lat for PortCos and firm offices that we haven't seen before.&lt;br /&gt;
&lt;br /&gt;
The last time this was run was vcdb20. Accordingly:&lt;br /&gt;
* In vcdb20, export the portcogeo, firmgeo, and bogeo tables&lt;br /&gt;
* Import them as portcogeo_vcdb20, firmgeo_vcdb20, and bogeo_vcdb20&lt;br /&gt;
* Build portcogrowthneedsgeo, firmneedsgeo, firmboneedsgeo files for geocoding&lt;br /&gt;
* Log into [https://console.cloud.google.com/ Google Console] and set up an API key. Note that:&lt;br /&gt;
** Up to $200/month should be free&lt;br /&gt;
** $5.00 USD per 1000 lookups. &lt;br /&gt;
** 3,000 QPM max&lt;br /&gt;
* In E:/tools/Geocode run the script(s): Geocode.py for portcos and GeocodeOneKey.py for everything else.&lt;br /&gt;
** Strip the header line out of the input file(s)&lt;br /&gt;
** python Geocode.py portcogrowthneedsgeo-NoHeader.txt&lt;br /&gt;
* Get the latest Gazetteer file(s): https://www.census.gov/geographies/reference-files/time-series/geo/gazetteer-files.html&lt;br /&gt;
* Check the coverage of portcogeo and create firmbogeoplus&lt;br /&gt;
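&lt;br /&gt;
The pricing notes above translate into a quick budget check before running the scripts. This is a minimal Python sketch that only uses the figures listed above ($5.00 per 1000 lookups, $200/month free, 3,000 QPM); geocoding_budget is a hypothetical helper and the current Google pricing should be checked before relying on it.&lt;br /&gt;
&lt;br /&gt;
```python
def geocoding_budget(n_lookups, price_per_1000=5.00, free_credit=200.00, qpm=3000):
    """Return (list cost, cost after the free credit, minimum run time in minutes)."""
    cost = n_lookups / 1000 * price_per_1000
    # Only the part of the cost above the monthly free credit is billed.
    billable = max(0.0, cost - free_credit)
    # The queries-per-minute cap sets a lower bound on how long a run takes.
    minutes = n_lookups / qpm
    return cost, billable, minutes


if __name__ == "__main__":
    # 50,000 lookups is an illustrative volume, not a count from the build.
    print(geocoding_budget(50000))
```

At these rates the free credit covers 40,000 lookups a month.&lt;br /&gt;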
&lt;br /&gt;
===Exits===&lt;br /&gt;
&lt;br /&gt;
Another part of Load.sql does the matching to IPOs and MAs and their precedence. Note that:&lt;br /&gt;
* Issuer and target names are matched against themselves using the [[Matcher.pl]] script in mode 2&lt;br /&gt;
* PortCo stdnames are matched to issuerstd and targetstd (separately) in mode 0&lt;br /&gt;
* The state and date matching requirements are in the load.sql.&lt;br /&gt;
* There seem to have been duplicate issue records in the IPO data for vcdb2020 (and perhaps earlier). The duplicates are often identical, except that the dates are a day apart. &lt;br /&gt;
* The IPO records also contain listings on junior and foreign exchanges, as well as some OTC - I left these in and flagged them.&lt;br /&gt;
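&lt;br /&gt;
One way to spot the near-duplicate issue records is sketched below in Python. It is an illustration under assumptions, not the logic in Load.sql: flag_near_duplicates is a hypothetical name, and records are reduced to (issuer, issue date) pairs.&lt;br /&gt;
&lt;br /&gt;
```python
from datetime import date


def flag_near_duplicates(issues):
    """Return the indexes of issues that repeat the same issuer on the same or the next day.

    issues is a list of (issuer, issue_date) tuples.
    """
    # Visit the records in issuer, date order so that near-duplicates are adjacent.
    ordered = sorted(range(len(issues)), key=lambda i: issues[i])
    flagged = set()
    previous = None
    for i in ordered:
        issuer, issue_date = issues[i]
        if previous is not None:
            prev_issuer, prev_date = previous
            if issuer == prev_issuer and (issue_date - prev_date).days in (0, 1):
                flagged.add(i)
        previous = (issuer, issue_date)
    return sorted(flagged)


if __name__ == "__main__":
    # Hypothetical records: the two Acme issues in March 2020 are a day apart.
    issues = [
        ("Acme", date(2020, 3, 5)),
        ("Beta", date(2019, 1, 1)),
        ("Acme", date(2020, 3, 4)),
        ("Acme", date(2021, 1, 1)),
    ]
    print(flag_near_duplicates(issues))
```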
&lt;br /&gt;
===Industry===&lt;br /&gt;
&lt;br /&gt;
The industry coding is in IndustryCodes.txt. Note that:&lt;br /&gt;
* The code map had to be updated. The Excel file is in projects/vcdb24 but I didn't have the original counts to hand.&lt;br /&gt;
* The codes are not unique (576 out of 585 are unique) at the industry subgroup 3 level.&lt;br /&gt;
* The codes (code, code20, code100) should be joined using indclass, indminorgroup, indsubgroup&lt;br /&gt;
* The codes are hierarchical at the 1, 2, and 4 digit (dg) levels, but not at the 3dg level.&lt;br /&gt;
* 1dg codes are IT, LS, and Other. &lt;br /&gt;
* Note that code is the full 4dg industry identifier, whereas code20 and code100 are name-based aggregates with at least 20 or 100 observations in them.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; &lt;br /&gt;
|-&lt;br /&gt;
! 2dg Code&lt;br /&gt;
! Minorcode&lt;br /&gt;
! No. of PortCos&lt;br /&gt;
|-&lt;br /&gt;
| 11&lt;br /&gt;
| Communications and Media&lt;br /&gt;
| 3930&lt;br /&gt;
|-&lt;br /&gt;
| 12&lt;br /&gt;
| Computer Hardware&lt;br /&gt;
| 3058&lt;br /&gt;
|-&lt;br /&gt;
| 13&lt;br /&gt;
| Computer Software and Services&lt;br /&gt;
| 21157&lt;br /&gt;
|-&lt;br /&gt;
| 14&lt;br /&gt;
| Internet Specific&lt;br /&gt;
| 14440&lt;br /&gt;
|-&lt;br /&gt;
| 15&lt;br /&gt;
| Semiconductors/Other Elect.&lt;br /&gt;
| 3145&lt;br /&gt;
|-&lt;br /&gt;
| 21&lt;br /&gt;
| Biotechnology&lt;br /&gt;
| 4251&lt;br /&gt;
|-&lt;br /&gt;
| 22&lt;br /&gt;
| Medical/Health&lt;br /&gt;
| 7138&lt;br /&gt;
|-&lt;br /&gt;
| 31&lt;br /&gt;
| Consumer Related&lt;br /&gt;
| 7459&lt;br /&gt;
|-&lt;br /&gt;
| 32&lt;br /&gt;
| Industrial/Energy&lt;br /&gt;
| 7028&lt;br /&gt;
|-&lt;br /&gt;
| 33&lt;br /&gt;
| Other Products&lt;br /&gt;
| 14246&lt;br /&gt;
|}&lt;br /&gt;
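&lt;br /&gt;
Because the codes nest at the 1, 2, and 4 digit levels, the coarser levels can be read off the front of the 4dg code. The Python sketch below is an illustration only: code_levels is a hypothetical name, the example code 1312 is made up, and the mapping of the leading digit to IT, LS, and Other is an assumption inferred from the 2dg table above.&lt;br /&gt;
&lt;br /&gt;
```python
# Assumed mapping of the leading digit to the 1dg groups (inferred from the 2dg table).
MAJOR_GROUPS = {"1": "IT", "2": "LS", "3": "Other"}


def code_levels(code):
    """Return (1dg label, 2dg code, 4dg code) for a 4-digit industry code."""
    text = str(code)
    if len(text) != 4 or not text.isdigit():
        raise ValueError("expected a 4-digit code, got %r" % (code,))
    # The 3dg prefix is deliberately not returned: the codes are not hierarchical there.
    return MAJOR_GROUPS[text[0]], text[:2], text


if __name__ == "__main__":
    # 1312 is a made-up code under 13 (Computer Software and Services).
    print(code_levels(1312))
```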
&lt;br /&gt;
I also tried some keyword industry coding from both short and long descriptions. The source code is at the top of BuildBaseTables.sql. The results are in sheets in the IndustryCodes.xlsx file.&lt;br /&gt;
&lt;br /&gt;
===BuildBaseTables.sql===&lt;br /&gt;
&lt;br /&gt;
Build the PortCoGrowthGeoId table that codes the city-state to a geoid.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; &lt;br /&gt;
|- style=&amp;quot;vertical-align:bottom;&amp;quot;&lt;br /&gt;
! origin&lt;br /&gt;
! count&lt;br /&gt;
! Method&lt;br /&gt;
|-&lt;br /&gt;
| style=&amp;quot;vertical-align:bottom;&amp;quot; | 1&lt;br /&gt;
| style=&amp;quot;vertical-align:bottom;&amp;quot; | 45,111&lt;br /&gt;
| style=&amp;quot;color:#808080;&amp;quot; | Address is geocoded and in tiger place&lt;br /&gt;
|-&lt;br /&gt;
| style=&amp;quot;vertical-align:middle; color:#808080;&amp;quot; | 2&lt;br /&gt;
| style=&amp;quot;vertical-align:bottom;&amp;quot; | 270&lt;br /&gt;
| style=&amp;quot;color:#808080;&amp;quot; | city, statecode matches to only 1 geoid, so use it&lt;br /&gt;
|-&lt;br /&gt;
| style=&amp;quot;vertical-align:middle; color:#808080;&amp;quot; | 3&lt;br /&gt;
| style=&amp;quot;vertical-align:bottom;&amp;quot; | 1,374&lt;br /&gt;
| style=&amp;quot;color:#808080;&amp;quot; | city, statecode matches to multiple geoids, use the most popular&lt;br /&gt;
|-&lt;br /&gt;
| style=&amp;quot;vertical-align:middle; color:#808080;&amp;quot; | 4&lt;br /&gt;
| style=&amp;quot;vertical-align:bottom;&amp;quot; | 964&lt;br /&gt;
| style=&amp;quot;color:#808080;&amp;quot; | 1:1 straight city&amp;lt;-&amp;gt;place and statecode match with tiger&lt;br /&gt;
|-&lt;br /&gt;
| style=&amp;quot;vertical-align:middle; color:#808080;&amp;quot; | 5&lt;br /&gt;
| style=&amp;quot;vertical-align:bottom;&amp;quot; | 509&lt;br /&gt;
| style=&amp;quot;color:#808080;&amp;quot; | Use zctaplaceinfo to lookup the best place choice for the zipcode&lt;br /&gt;
|- style=&amp;quot;vertical-align:bottom;&amp;quot;&lt;br /&gt;
| style=&amp;quot;vertical-align:middle; color:#808080;&amp;quot; | 6&lt;br /&gt;
| 636&lt;br /&gt;
| style=&amp;quot;color:#808080;&amp;quot; | Unable to code&lt;br /&gt;
|-&lt;br /&gt;
| style=&amp;quot;vertical-align:middle; color:#808080;&amp;quot; | 9&lt;br /&gt;
| style=&amp;quot;vertical-align:bottom;&amp;quot; | 24&lt;br /&gt;
| style=&amp;quot;color:#808080;&amp;quot; | Custom coded&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===StartupCities===&lt;br /&gt;
&lt;br /&gt;
The original Startup Cities code is in E:\projects\BayesianStartupCities\V1\startupcities.sql. The new version is in e:\projects\BayesianStartupCities\StartupCitiesV2.sql.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48643</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48643"/>
		<updated>2024-06-13T19:00:26Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Industry */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]] or the [[McNair Center]] build, which was called [[VentureXpert Data]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last regex. It deals with the spaces in the description and the cases when there is no description.&lt;br /&gt;
#Try importing USVCPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;br /&gt;
&lt;br /&gt;
===Geocoding===&lt;br /&gt;
&lt;br /&gt;
Part of Load.sql requires we update the Geocoding - adding new long and lat for PortCos and firm offices that we haven't seen before.&lt;br /&gt;
&lt;br /&gt;
The last time this was run was vcdb20. Accordingly:&lt;br /&gt;
* In vcdb20, export the portcogeo, firmgeo, and bogeo tables&lt;br /&gt;
* Import them as portcogeo_vcdb20, firmgeo_vcdb20, and bogeo_vcdb20&lt;br /&gt;
* Build portcogrowthneedsgeo, firmneedsgeo, firmboneedsgeo files for geocoding&lt;br /&gt;
* Log into [https://console.cloud.google.com/ Google Console] and set up an API key. Note that:&lt;br /&gt;
** Up to $200/month should be free&lt;br /&gt;
** $5.00 USD per 1000 lookups. &lt;br /&gt;
** 3,000 QPM max&lt;br /&gt;
* In E:/tools/Geocode run the script(s): Geocode.py for portcos and GeocodeOneKey.py for everything else.&lt;br /&gt;
** Strip the header line out of the input file(s)&lt;br /&gt;
** python Geocode.py portcogrowthneedsgeo-NoHeader.txt&lt;br /&gt;
* Get the latest Gazetteer file(s): https://www.census.gov/geographies/reference-files/time-series/geo/gazetteer-files.html&lt;br /&gt;
* Check the coverage of portcogeo and create firmbogeoplus&lt;br /&gt;
&lt;br /&gt;
===Exits===&lt;br /&gt;
&lt;br /&gt;
Another part of Load.sql does the matching to IPOs and MAs and their precedence. Note that:&lt;br /&gt;
* Issuer and target names are matched against themselves using the [[Matcher.pl]] script in mode 2&lt;br /&gt;
* PortCo stdnames are matched to issuerstd and targetstd (separately) in mode 0&lt;br /&gt;
* The state and date matching requirements are in the load.sql.&lt;br /&gt;
* There seem to have been duplicate issue records in the IPO data for vcdb2020 (and perhaps earlier). The duplicates are often identical, except that the dates are a day apart. &lt;br /&gt;
* The IPO records also contain listings on junior and foreign exchanges, as well as some OTC - I left these in and flagged them.&lt;br /&gt;
&lt;br /&gt;
===Industry===&lt;br /&gt;
&lt;br /&gt;
The industry coding is in IndustryCodes.txt. Note that:&lt;br /&gt;
* The code map had to be updated. The Excel file is in projects/vcdb24 but I didn't have the original counts to hand.&lt;br /&gt;
* The codes are not unique (576 out of 585 are unique) at the industry subgroup 3 level.&lt;br /&gt;
* The codes (code, code20, code100) should be joined using indclass, indminorgroup, indsubgroup&lt;br /&gt;
* The codes are hierarchical at the 1, 2, and 4 digit (dg) levels, but not at the 3dg level.&lt;br /&gt;
* 1dg codes are IT, LS, and Other. &lt;br /&gt;
* Note that code is the full 4dg industry identifier, whereas code20 and code100 are name-based aggregates with at least 20 or 100 observations in them.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; &lt;br /&gt;
|-&lt;br /&gt;
! 2dg Code&lt;br /&gt;
! Minorcode&lt;br /&gt;
! No. of PortCos&lt;br /&gt;
|-&lt;br /&gt;
| 11&lt;br /&gt;
| Communications and Media&lt;br /&gt;
| 3930&lt;br /&gt;
|-&lt;br /&gt;
| 12&lt;br /&gt;
| Computer Hardware&lt;br /&gt;
| 3058&lt;br /&gt;
|-&lt;br /&gt;
| 13&lt;br /&gt;
| Computer Software and Services&lt;br /&gt;
| 21157&lt;br /&gt;
|-&lt;br /&gt;
| 14&lt;br /&gt;
| Internet Specific&lt;br /&gt;
| 14440&lt;br /&gt;
|-&lt;br /&gt;
| 15&lt;br /&gt;
| Semiconductors/Other Elect.&lt;br /&gt;
| 3145&lt;br /&gt;
|-&lt;br /&gt;
| 21&lt;br /&gt;
| Biotechnology&lt;br /&gt;
| 4251&lt;br /&gt;
|-&lt;br /&gt;
| 22&lt;br /&gt;
| Medical/Health&lt;br /&gt;
| 7138&lt;br /&gt;
|-&lt;br /&gt;
| 31&lt;br /&gt;
| Consumer Related&lt;br /&gt;
| 7459&lt;br /&gt;
|-&lt;br /&gt;
| 32&lt;br /&gt;
| Industrial/Energy&lt;br /&gt;
| 7028&lt;br /&gt;
|-&lt;br /&gt;
| 33&lt;br /&gt;
| Other Products&lt;br /&gt;
| 14246&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
I also tried some keyword industry coding from both short and long descriptions. The source code is at the top of BuildBaseTables.sql. The results are in sheets in the IndustryCodes.xlsx file.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48642</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48642"/>
		<updated>2024-06-10T21:30:44Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Industry */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]] or the [[McNair Center]] build, which was called [[VentureXpert Data]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last regex. It deals with the spaces in the description and the cases when there is no description.&lt;br /&gt;
#Try importing USVCPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;br /&gt;
&lt;br /&gt;
===Geocoding===&lt;br /&gt;
&lt;br /&gt;
Part of Load.sql requires we update the Geocoding - adding new long and lat for PortCos and firm offices that we haven't seen before.&lt;br /&gt;
&lt;br /&gt;
The last time this was run was vcdb20. Accordingly:&lt;br /&gt;
* In vcdb20, export the portcogeo, firmgeo, and bogeo tables&lt;br /&gt;
* Import them as portcogeo_vcdb20, firmgeo_vcdb20, and bogeo_vcdb20&lt;br /&gt;
* Build portcogrowthneedsgeo, firmneedsgeo, firmboneedsgeo files for geocoding&lt;br /&gt;
* Log into [https://console.cloud.google.com/ Google Console] and set up an API key. Note that:&lt;br /&gt;
** Up to $200/month should be free&lt;br /&gt;
** $5.00 USD per 1000 lookups. &lt;br /&gt;
** 3,000 QPM max&lt;br /&gt;
* In E:/tools/Geocode run the script(s): Geocode.py for portcos and GeocodeOneKey.py for everything else.&lt;br /&gt;
** Strip the header line out of the input file(s)&lt;br /&gt;
** python Geocode.py portcogrowthneedsgeo-NoHeader.txt&lt;br /&gt;
* Get the latest Gazetteer file(s): https://www.census.gov/geographies/reference-files/time-series/geo/gazetteer-files.html&lt;br /&gt;
* Check the coverage of portcogeo and create firmbogeoplus&lt;br /&gt;
&lt;br /&gt;
===Exits===&lt;br /&gt;
&lt;br /&gt;
Another part of Load.sql does the matching to IPOs and MAs and their precedence. Note that:&lt;br /&gt;
* Issuer and target names are matched against themselves using the [[Matcher.pl]] script in mode 2&lt;br /&gt;
* PortCo stdnames are matched to issuerstd and targetstd (separately) in mode 0&lt;br /&gt;
* The state and date matching requirements are in the load.sql.&lt;br /&gt;
* There seem to have been duplicate issue records in the IPO data for vcdb2020 (and perhaps earlier). The duplicates are often identical, except that the dates are a day apart. &lt;br /&gt;
* The IPO records also contain listings on junior and foreign exchanges, as well as some OTC - I left these in and flagged them.&lt;br /&gt;
&lt;br /&gt;
===Industry===&lt;br /&gt;
&lt;br /&gt;
The industry coding is in IndustryCodes.txt. Note that:&lt;br /&gt;
* The code map had to be updated. The Excel file is in projects/vcdb24 but I didn't have the original counts to hand.&lt;br /&gt;
* The codes are not unique (576 out of 585 are unique) at the industry subgroup 3 level.&lt;br /&gt;
* The codes (code, code20, code100) should be joined using indclass, indminorgroup, indsubgroup&lt;br /&gt;
* The codes are hierarchical at the 1, 2, and 4 digit (dg) levels, but not at the 3dg level.&lt;br /&gt;
* 1dg codes are IT, LS, and Other. &lt;br /&gt;
* Note that code is the full 4dg industry identifier, whereas code20 and code100 are name-based aggregates with at least 20 or 100 observations in them.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; &lt;br /&gt;
|-&lt;br /&gt;
! 2dg Code&lt;br /&gt;
! Minorcode&lt;br /&gt;
! No. of PortCos&lt;br /&gt;
|-&lt;br /&gt;
| 11&lt;br /&gt;
| Communications and Media&lt;br /&gt;
| 3930&lt;br /&gt;
|-&lt;br /&gt;
| 12&lt;br /&gt;
| Computer Hardware&lt;br /&gt;
| 3058&lt;br /&gt;
|-&lt;br /&gt;
| 13&lt;br /&gt;
| Computer Software and Services&lt;br /&gt;
| 21157&lt;br /&gt;
|-&lt;br /&gt;
| 14&lt;br /&gt;
| Internet Specific&lt;br /&gt;
| 14440&lt;br /&gt;
|-&lt;br /&gt;
| 15&lt;br /&gt;
| Semiconductors/Other Elect.&lt;br /&gt;
| 3145&lt;br /&gt;
|-&lt;br /&gt;
| 21&lt;br /&gt;
| Biotechnology&lt;br /&gt;
| 4251&lt;br /&gt;
|-&lt;br /&gt;
| 22&lt;br /&gt;
| Medical/Health&lt;br /&gt;
| 7138&lt;br /&gt;
|-&lt;br /&gt;
| 31&lt;br /&gt;
| Consumer Related&lt;br /&gt;
| 7459&lt;br /&gt;
|-&lt;br /&gt;
| 32&lt;br /&gt;
| Industrial/Energy&lt;br /&gt;
| 7028&lt;br /&gt;
|-&lt;br /&gt;
| 33&lt;br /&gt;
| Other Products&lt;br /&gt;
| 14246&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48641</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48641"/>
		<updated>2024-06-10T21:29:48Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Industry */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]] or the [[McNair Center]] build, which was called [[VentureXpert Data]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last section of the regex process below (starting from In5.txt). It collapses the extra spaces in the description and handles the cases where there is no description.&lt;br /&gt;
#Try importing USVCPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;br /&gt;
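&lt;br /&gt;
As a rough cross-check, the first section of the pipeline above (Process.txt to Out5.txt) could be sketched in Python as follows. This is only a sketch: the function name and the sample record layout are hypothetical, it assumes UNIX line endings, and it assumes the last step is meant to capture the four digits, i.e. (\d{4}) followed by a trailing space.&lt;br /&gt;

```python
import re


def preprocess_long_description(text):
    """Mirror the five perl -pe steps that turn Process.txt into Out5.txt."""
    lines = text.splitlines(keepends=True)
    # Out1: mark each line that starts with a non-space character as a new record.
    lines = [re.sub(r"^([^ ])", r"###\1", line) for line in lines]
    # Out2: collapse runs of 65 or more whitespace characters to one space.
    lines = [re.sub(r"\s{65,}", " ", line) for line in lines]
    # Out3: drop every newline so continuation lines join their record.
    joined = "".join(lines).replace("\n", "")
    # Out4: start each record on its own line.
    records = joined.replace("###", "\n").splitlines(keepends=True)
    # Out5: a 4-digit year followed by a trailing space gets a tab instead.
    return "".join(re.sub(r"(\d{4}) $", r"\1\t", rec) for rec in records)


if __name__ == "__main__":
    # Hypothetical two-record sample; the real SDC layout may differ.
    sample = (
        "Acme Corp      01/02/2001 \n"
        + " " * 70 + "makes widgets\n"
        + "Beta Inc       03/04/1999 \n"
    )
    print(repr(preprocess_long_description(sample)))
```

Because the regex steps are applied line by line, as perl -pe does, the output should match the shell version on well-formed input, but it has not been run against the real SDC export.&lt;br /&gt;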
&lt;br /&gt;
===Geocoding===&lt;br /&gt;
&lt;br /&gt;
Part of Load.sql requires we update the Geocoding - adding new long and lat for PortCos and firm offices that we haven't seen before.&lt;br /&gt;
&lt;br /&gt;
The last time this was run was vcdb20. Accordingly:&lt;br /&gt;
* In vcdb20, export the portcogeo, firmgeo, and bogeo tables&lt;br /&gt;
* Import them as portcogeo_vcdb20, firmgeo_vcdb20, and bogeo_vcdb20&lt;br /&gt;
* Build portcogrowthneedsgeo, firmneedsgeo, firmboneedsgeo files for geocoding&lt;br /&gt;
* Log into [https://console.cloud.google.com/ Google Console] and set up an API key. Note that:&lt;br /&gt;
** Up to $200/month should be free&lt;br /&gt;
** $5.00 USD per 1000 lookups. &lt;br /&gt;
** 3,000 QPM max&lt;br /&gt;
* In E:/tools/Geocode run the script(s): Geocode.py for portcos and GeocodeOneKey.py for everything else.&lt;br /&gt;
** Strip the header line out of the input file(s)&lt;br /&gt;
** python Geocode.py portcogrowthneedsgeo-NoHeader.txt&lt;br /&gt;
* Get the latest Gazetteer file(s): https://www.census.gov/geographies/reference-files/time-series/geo/gazetteer-files.html&lt;br /&gt;
* Check the coverage of portcogeo and create firmbogeoplus&lt;br /&gt;
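&lt;br /&gt;
The pricing notes above imply some simple arithmetic for sizing a geocoding run. The sketch below takes the figures from those notes at face value (they are not checked against Google's current pricing), and the function name is hypothetical.&lt;br /&gt;

```python
import math

PRICE_PER_1000 = 5.00   # USD per 1000 lookups, from the notes above
FREE_CREDIT = 200.00    # USD per month that should be free
MAX_QPM = 3000          # maximum queries per minute


def geocoding_budget(n_lookups):
    """Return (list cost, cost after the free credit, minimum minutes)."""
    cost = n_lookups / 1000 * PRICE_PER_1000
    billable = max(0.0, cost - FREE_CREDIT)
    minutes = math.ceil(n_lookups / MAX_QPM)
    return cost, billable, minutes


if __name__ == "__main__":
    print(geocoding_budget(40000))
```

On these figures, roughly 40,000 lookups fit inside the monthly free credit, and the rate limit, not the cost, sets the minimum run time.&lt;br /&gt;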
&lt;br /&gt;
===Exits===&lt;br /&gt;
&lt;br /&gt;
Another part of Load.sql does the matching to IPOs and MAs and their precedence. Note that:&lt;br /&gt;
* Issuer and target names are matched against themselves using the [[Matcher.pl]] script in mode 2&lt;br /&gt;
* PortCo stdnames are matched to issuerstd and targetstd (separately) in mode 0&lt;br /&gt;
* The state and date matching requirements are in Load.sql.&lt;br /&gt;
* There seem to have been duplicate issue records in the IPO data for vcdb2020 (and perhaps earlier). The duplicates are often identical, except that their dates are a day apart.&lt;br /&gt;
* The IPO records also contain listings on junior and foreign exchanges, as well as some OTC - I left these in and flagged them.&lt;br /&gt;
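&lt;br /&gt;
One way to spot the near-duplicate IPO records described above is to sort by issuer and date and flag any record that repeats the previous issuer on the same or the next day. This is an illustrative sketch only, not the logic in Load.sql; the (issuer, date) record shape and the function name are assumptions.&lt;br /&gt;

```python
from datetime import date


def flag_near_duplicates(records):
    """Return indexes of records that repeat the prior issuer within one day.

    records is a list of (issuer, issue_date) tuples (hypothetical shape).
    """
    order = sorted(range(len(records)), key=lambda i: records[i])
    flagged = set()
    for prev, cur in zip(order, order[1:]):
        name_a, date_a = records[prev]
        name_b, date_b = records[cur]
        # After sorting, date_b is never earlier than date_a for one issuer.
        if name_a == name_b and (date_b - date_a).days in (0, 1):
            flagged.add(cur)
    return sorted(flagged)


if __name__ == "__main__":
    ipos = [
        ("Acme", date(2020, 5, 1)),
        ("Beta", date(2020, 5, 1)),
        ("Acme", date(2020, 5, 2)),
    ]
    print(flag_near_duplicates(ipos))
```

Flagging rather than deleting keeps the choice of which record to retain as a separate, reviewable step.&lt;br /&gt;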
&lt;br /&gt;
==Industry==&lt;br /&gt;
&lt;br /&gt;
The industry coding is in IndustryCodes.txt. Note that:&lt;br /&gt;
* The code map had to be updated. The Excel file is in projects/vcdb24 but I didn't have the original counts to hand.&lt;br /&gt;
* The codes are not unique (576 out of 585 are unique) at the industry subgroup 3 level.&lt;br /&gt;
* The codes (code, code20, code100) should be joined using indclass, indminorgroup, indsubgroup&lt;br /&gt;
* The codes are hierarchical at the 1, 2, and 4 digit (dg) levels, but not at the 3 digit level.&lt;br /&gt;
* 1dg codes are IT, LS, and Other. &lt;br /&gt;
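&lt;br /&gt;
If the leading digit of each 2dg code identifies its 1dg group (1 for IT, 2 for LS, 3 for Other), the 2dg counts can be rolled up as sketched below. That mapping is an inference from the code ranges in the table on this page, not something stated in IndustryCodes.txt; the counts are copied from that table and the names are hypothetical.&lt;br /&gt;

```python
# Counts per 2dg code, copied from the table on this page.
PORTCOS_BY_2DG = {
    11: 3930, 12: 3058, 13: 21157, 14: 14440, 15: 3145,
    21: 4251, 22: 7138,
    31: 7459, 32: 7028, 33: 14246,
}
# Assumed mapping from the leading digit to the 1dg code.
ONE_DIGIT = {1: "IT", 2: "LS", 3: "Other"}


def portcos_by_1dg(counts):
    """Sum the 2dg counts into their assumed 1dg groups."""
    totals = {}
    for code, n in counts.items():
        group = ONE_DIGIT[code // 10]
        totals[group] = totals.get(group, 0) + n
    return totals


if __name__ == "__main__":
    print(portcos_by_1dg(PORTCOS_BY_2DG))
```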
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; &lt;br /&gt;
|-&lt;br /&gt;
! 2dg Code&lt;br /&gt;
! Minorcode&lt;br /&gt;
! No. of PortCos&lt;br /&gt;
|-&lt;br /&gt;
| 11&lt;br /&gt;
| Communications and Media&lt;br /&gt;
| 3930&lt;br /&gt;
|-&lt;br /&gt;
| 12&lt;br /&gt;
| Computer Hardware&lt;br /&gt;
| 3058&lt;br /&gt;
|-&lt;br /&gt;
| 13&lt;br /&gt;
| Computer Software and Services&lt;br /&gt;
| 21157&lt;br /&gt;
|-&lt;br /&gt;
| 14&lt;br /&gt;
| Internet Specific&lt;br /&gt;
| 14440&lt;br /&gt;
|-&lt;br /&gt;
| 15&lt;br /&gt;
| Semiconductors/Other Elect.&lt;br /&gt;
| 3145&lt;br /&gt;
|-&lt;br /&gt;
| 21&lt;br /&gt;
| Biotechnology&lt;br /&gt;
| 4251&lt;br /&gt;
|-&lt;br /&gt;
| 22&lt;br /&gt;
| Medical/Health&lt;br /&gt;
| 7138&lt;br /&gt;
|-&lt;br /&gt;
| 31&lt;br /&gt;
| Consumer Related&lt;br /&gt;
| 7459&lt;br /&gt;
|-&lt;br /&gt;
| 32&lt;br /&gt;
| Industrial/Energy&lt;br /&gt;
| 7028&lt;br /&gt;
|-&lt;br /&gt;
| 33&lt;br /&gt;
| Other Products&lt;br /&gt;
| 14246&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48640</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48640"/>
		<updated>2024-06-10T21:29:24Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]] or the [[McNair Center]] build, which was called [[VentureXpert Data]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last section of the regex process below (starting from In5.txt). It collapses the extra spaces in the description and handles the cases where there is no description.&lt;br /&gt;
#Try importing USVCPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;br /&gt;
&lt;br /&gt;
===Geocoding===&lt;br /&gt;
&lt;br /&gt;
Part of Load.sql requires we update the Geocoding - adding new long and lat for PortCos and firm offices that we haven't seen before.&lt;br /&gt;
&lt;br /&gt;
The last time this was run was vcdb20. Accordingly:&lt;br /&gt;
* In vcdb20, export the portcogeo, firmgeo, and bogeo tables&lt;br /&gt;
* Import them as portcogeo_vcdb20, firmgeo_vcdb20, and bogeo_vcdb20&lt;br /&gt;
* Build portcogrowthneedsgeo, firmneedsgeo, firmboneedsgeo files for geocoding&lt;br /&gt;
* Log into [https://console.cloud.google.com/ Google Console] and set up an API key. Note that:&lt;br /&gt;
** Up to $200/month should be free&lt;br /&gt;
** $5.00 USD per 1000 lookups. &lt;br /&gt;
** 3,000 QPM max&lt;br /&gt;
* In E:/tools/Geocode run the script(s): Geocode.py for portcos and GeocodeOneKey.py for everything else.&lt;br /&gt;
** Strip the header line out of the input file(s)&lt;br /&gt;
** python Geocode.py portcogrowthneedsgeo-NoHeader.txt&lt;br /&gt;
* Get the latest Gazetteer file(s): https://www.census.gov/geographies/reference-files/time-series/geo/gazetteer-files.html&lt;br /&gt;
* Check the coverage of portcogeo and create firmbogeoplus&lt;br /&gt;
&lt;br /&gt;
===Exits===&lt;br /&gt;
&lt;br /&gt;
Another part of Load.sql does the matching to IPOs and MAs and their precedence. Note that:&lt;br /&gt;
* Issuer and target names are matched against themselves using the [[Matcher.pl]] script in mode 2&lt;br /&gt;
* PortCo stdnames are matched to issuerstd and targetstd (separately) in mode 0&lt;br /&gt;
* The state and date matching requirements are in Load.sql.&lt;br /&gt;
* There seem to have been duplicate issue records in the IPO data for vcdb2020 (and perhaps earlier). The duplicates are often identical, except that their dates are a day apart.&lt;br /&gt;
* The IPO records also contain listings on junior and foreign exchanges, as well as some OTC - I left these in and flagged them.&lt;br /&gt;
&lt;br /&gt;
==Industry==&lt;br /&gt;
&lt;br /&gt;
The industry coding is in IndustryCodes.txt. Note that:&lt;br /&gt;
* The code map had to be updated. The Excel file is in projects/vcdb24 but I didn't have the original counts to hand.&lt;br /&gt;
* The codes are not unique (576 out of 585 are unique) at the industry subgroup 3 level.&lt;br /&gt;
* The codes (code, code20, code100) should be joined using indclass, indminorgroup, indsubgroup&lt;br /&gt;
* The codes are hierarchical at the 1, 2, and 4 digit (dg) levels, but not at the 3 digit level.&lt;br /&gt;
* 1dg codes are IT, LS, and Other. &lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; &lt;br /&gt;
|-&lt;br /&gt;
! 2dg Code&lt;br /&gt;
! Minorcode&lt;br /&gt;
! No. of PortCos&lt;br /&gt;
|-&lt;br /&gt;
| 11&lt;br /&gt;
| Communications and Media&lt;br /&gt;
| 3930&lt;br /&gt;
|-&lt;br /&gt;
| 12&lt;br /&gt;
| Computer Hardware&lt;br /&gt;
| 3058&lt;br /&gt;
|-&lt;br /&gt;
| 13&lt;br /&gt;
| Computer Software and Services&lt;br /&gt;
| 21157&lt;br /&gt;
|-&lt;br /&gt;
| 14&lt;br /&gt;
| Internet Specific&lt;br /&gt;
| 14440&lt;br /&gt;
|-&lt;br /&gt;
| 15&lt;br /&gt;
| Semiconductors/Other Elect.&lt;br /&gt;
| 3145&lt;br /&gt;
|-&lt;br /&gt;
| 21&lt;br /&gt;
| Biotechnology&lt;br /&gt;
| 4251&lt;br /&gt;
|-&lt;br /&gt;
| 22&lt;br /&gt;
| Medical/Health&lt;br /&gt;
| 7138&lt;br /&gt;
|-&lt;br /&gt;
| 31&lt;br /&gt;
| Consumer Related&lt;br /&gt;
| 7459&lt;br /&gt;
|-&lt;br /&gt;
| 32&lt;br /&gt;
| Industrial/Energy&lt;br /&gt;
| 7028&lt;br /&gt;
|-&lt;br /&gt;
| 33&lt;br /&gt;
| Other Products&lt;br /&gt;
| 14246&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48639</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48639"/>
		<updated>2024-06-07T20:29:53Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last section of the regex process below (starting from In5.txt). It collapses the extra spaces in the description and handles the cases where there is no description.&lt;br /&gt;
#Try importing USVCPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;br /&gt;
&lt;br /&gt;
===Geocoding===&lt;br /&gt;
&lt;br /&gt;
Part of Load.sql requires we update the Geocoding - adding new long and lat for PortCos and firm offices that we haven't seen before.&lt;br /&gt;
&lt;br /&gt;
The last time this was run was vcdb20. Accordingly:&lt;br /&gt;
* In vcdb20, export the portcogeo, firmgeo, and bogeo tables&lt;br /&gt;
* Import them as portcogeo_vcdb20, firmgeo_vcdb20, and bogeo_vcdb20&lt;br /&gt;
* Build portcogrowthneedsgeo, firmneedsgeo, firmboneedsgeo files for geocoding&lt;br /&gt;
* Log into [https://console.cloud.google.com/ Google Console] and set up an API key. Note that:&lt;br /&gt;
** Up to $200/month should be free&lt;br /&gt;
** $5.00 USD per 1000 lookups. &lt;br /&gt;
** 3,000 QPM max&lt;br /&gt;
* In E:/tools/Geocode run the script(s): Geocode.py for portcos and GeocodeOneKey.py for everything else.&lt;br /&gt;
** Strip the header line out of the input file(s)&lt;br /&gt;
** python Geocode.py portcogrowthneedsgeo-NoHeader.txt&lt;br /&gt;
* Get the latest Gazetteer file(s): https://www.census.gov/geographies/reference-files/time-series/geo/gazetteer-files.html&lt;br /&gt;
* Check the coverage of portcogeo and create firmbogeoplus&lt;br /&gt;
&lt;br /&gt;
===Exits===&lt;br /&gt;
&lt;br /&gt;
Another part of Load.sql does the matching to IPOs and MAs and their precedence. Note that:&lt;br /&gt;
* Issuer and target names are matched against themselves using the [[Matcher.pl]] script in mode 2&lt;br /&gt;
* PortCo stdnames are matched to issuerstd and targetstd (separately) in mode 0&lt;br /&gt;
* The state and date matching requirements are in Load.sql.&lt;br /&gt;
* There seem to have been duplicate issue records in the IPO data for vcdb2020 (and perhaps earlier). The duplicates are often identical, except that their dates are a day apart.&lt;br /&gt;
* The IPO records also contain listings on junior and foreign exchanges, as well as some OTC - I left these in and flagged them.&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48638</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48638"/>
		<updated>2024-06-04T19:26:21Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Geocoding */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last section of the regex process below (starting from In5.txt). It collapses the extra spaces in the description and handles the cases where there is no description.&lt;br /&gt;
#Try importing USVCPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;br /&gt;
&lt;br /&gt;
===Geocoding===&lt;br /&gt;
&lt;br /&gt;
Part of Load.sql requires we update the Geocoding - adding new long and lat for PortCos and firm offices that we haven't seen before.&lt;br /&gt;
&lt;br /&gt;
The last time this was run was vcdb20. Accordingly:&lt;br /&gt;
* In vcdb20, export the portcogeo, firmgeo, and bogeo tables&lt;br /&gt;
* Import them as portcogeo_vcdb20, firmgeo_vcdb20, and bogeo_vcdb20&lt;br /&gt;
* Build portcogrowthneedsgeo, firmneedsgeo, firmboneedsgeo files for geocoding&lt;br /&gt;
* Log into [https://console.cloud.google.com/ Google Console] and set up an API key. Note that:&lt;br /&gt;
** Up to $200/month should be free&lt;br /&gt;
** $5.00 USD per 1000 lookups. &lt;br /&gt;
** 3,000 QPM max&lt;br /&gt;
* In E:/tools/Geocode run the script(s): Geocode.py for portcos and GeocodeOneKey.py for everything else.&lt;br /&gt;
** Strip the header line out of the input file(s)&lt;br /&gt;
** python Geocode.py portcogrowthneedsgeo-NoHeader.txt&lt;br /&gt;
* Get the latest Gazetteer file(s): https://www.census.gov/geographies/reference-files/time-series/geo/gazetteer-files.html&lt;br /&gt;
* Check the coverage of portcogeo and create firmbogeoplus&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48637</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48637"/>
		<updated>2024-06-04T17:04:23Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Geocoding */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last section of the regex process below (starting from In5.txt). It collapses the extra spaces in the description and handles the cases where there is no description.&lt;br /&gt;
#Try importing USVCPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;br /&gt;
&lt;br /&gt;
===Geocoding===&lt;br /&gt;
&lt;br /&gt;
Part of Load.sql requires that we update the Geocoding - adding new long and lat for PortCos and firm offices that we haven't seen before.&lt;br /&gt;
&lt;br /&gt;
The last time this was run was vcdb20. Accordingly:&lt;br /&gt;
* In vcdb20, export the portcogeo, firmgeo, and bogeo tables&lt;br /&gt;
* Import them as portcogeo_vcdb20, firmgeo_vcdb20, and bogeo_vcdb20&lt;br /&gt;
* Build portcogrowthneedsgeo, firmneedsgeo, firmboneedsgeo files for geocoding&lt;br /&gt;
* Log into [https://console.cloud.google.com/ Google Console] and set up an API key. Note that:&lt;br /&gt;
** Up to $200/month should be free&lt;br /&gt;
** $5.00 USD per 1000 lookups. &lt;br /&gt;
** 3,000 QPM max&lt;br /&gt;
* In E:/tools/Geocode run the script(s): Geocode.py for portcos and GeocodeOneKey.py for everything else.&lt;br /&gt;
** Strip the header line out of the input file(s)&lt;br /&gt;
** python Geocode.py portcogrowthneedsgeo-NoHeader.txt&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48636</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48636"/>
		<updated>2024-06-04T17:03:54Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Geocoding */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last section of the regex process below (starting from In5.txt). It collapses the extra spaces in the description and handles the cases where there is no description.&lt;br /&gt;
#Try importing USVCPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;br /&gt;
&lt;br /&gt;
===Geocoding===&lt;br /&gt;
&lt;br /&gt;
Part of Load.sql requires that we update the Geocoding - adding new long and lat for PortCos and firm offices that we haven't seen before.&lt;br /&gt;
&lt;br /&gt;
The last time this was run was vcdb20. Accordingly:&lt;br /&gt;
* In vcdb20, export the portcogeo, firmgeo, and bogeo tables&lt;br /&gt;
* Import them as portcogeo_vcdb20, firmgeo_vcdb20, and bogeo_vcdb20&lt;br /&gt;
* Build portcogrowthneedsgeo, firmneedsgeo, firmboneedsgeo files for geocoding&lt;br /&gt;
* Log into [https://console.cloud.google.com/ Google Console] and set up an API key. Note that:&lt;br /&gt;
** Up to $200/month should be free&lt;br /&gt;
** $5.00 USD per 1000 lookups. &lt;br /&gt;
** 3,000 QPM max&lt;br /&gt;
* In E:/tools/Geocode run the script(s): Geocode.py for portcos and GeocodeOneKey.py for everything else.&lt;br /&gt;
** Strip the header line out of the input file(s)&lt;br /&gt;
** python Geocode.py portcogrowthneedsgeo-NoHeader.txt&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48635</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48635"/>
		<updated>2024-06-04T16:02:40Z</updated>

		<summary type="html">&lt;p&gt;Ed: /* Geocoding */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last regex. It deals with the spaces in the description and the cases when there is no description.&lt;br /&gt;
#Try importing USPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;br /&gt;
&lt;br /&gt;
===Geocoding===&lt;br /&gt;
&lt;br /&gt;
Part of Load.sql requires that we update the Geocoding - adding new long and lat for PortCos and firm offices that we haven't seen before.&lt;br /&gt;
&lt;br /&gt;
The last time this was run was vcdb20. Accordingly:&lt;br /&gt;
* In vcdb20, export the portcogeo, firmgeo, and bogeo tables&lt;br /&gt;
* Import them as portcogeo_vcdb20, firmgeo_vcdb20, and bogeo_vcdb20&lt;br /&gt;
* Build portcogrowthneedsgeo, firmneedsgeo, firmboneedsgeo files for geocoding&lt;br /&gt;
* Log into [https://console.cloud.google.com/ Google Console] and set up an API key. Note that:&lt;br /&gt;
** Up to $200/month should be free&lt;br /&gt;
** $5.00 USD per 1000 lookups. &lt;br /&gt;
** 3,000 QPM max&lt;br /&gt;
* In E:/tools/Geocode run the script(s): Geocode.py for portcos and GeocodeOneKey.py for everything else.&lt;br /&gt;
** Strip the header line out of the input file(s)&lt;br /&gt;
** python Geocode.py portcogrowthneedsgeo-NoHeader-10.txt&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48634</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48634"/>
		<updated>2024-06-04T03:27:13Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last regex. It deals with the spaces in the description and the cases when there is no description.&lt;br /&gt;
#Try importing USPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;br /&gt;
&lt;br /&gt;
===Geocoding===&lt;br /&gt;
&lt;br /&gt;
Part of Load.sql requires that we update the Geocoding - adding new long and lat for PortCos and firm offices that we haven't seen before.&lt;br /&gt;
&lt;br /&gt;
The last time this was run was vcdb20. Accordingly:&lt;br /&gt;
* In vcdb20, export the portcogeo, firmgeo, and bogeo tables&lt;br /&gt;
* Import them as portcogeo_vcdb20, firmgeo_vcdb20, and bogeo_vcdb20&lt;br /&gt;
* Build portcogrowthneedsgeo, firmneedsgeo, firmboneedsgeo files for geocoding&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB20&amp;diff=48633</id>
		<title>VCDB20</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB20&amp;diff=48633"/>
		<updated>2024-05-31T20:39:52Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|Has project output=Data,Tool,How-to&lt;br /&gt;
|Has title=VCDB20&lt;br /&gt;
|Has owner=Ed Egan&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;onlyinclude&amp;gt;The [[VCDB20]] project documents a build of my VCDB -- '''V'''enture '''C'''apital '''D'''ata'''B'''ase -- covering until the end of 2020. Each VCDB includes investments, funds, startups, executives, exits, locations, and more, derived from data from [[VentureXpert]]. This project updates [[vcdb4]], which covered (almost) to the end of Q3 2019, and replaces [[VCDB20H1]] and [[VCDB20Q3]], which were partial builds. See also: [[SDC Normalizer]].&amp;lt;/onlyinclude&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Data design==&lt;br /&gt;
&lt;br /&gt;
I followed the same data design as in [[VCDB20H1]]. Essentially, the specification pulls everything, even things that aren't needed like incomplete M&amp;amp;As or withdrawn IPOs, or funds or investments that aren't venture capital, all the way to the present (the pull was done on 2020-01-04), and then places restrictions on the data later. Crucially, the pulls no longer use the venture-related flag, so the data contains private equity and other deals, as well as secondaries and purchases. Note that the M&amp;amp;As are pulled separately for public and private acquirers, and in chunks by year to keep the request sizes manageable.&lt;br /&gt;
&lt;br /&gt;
==Processing Steps==&lt;br /&gt;
&lt;br /&gt;
===Source Files===&lt;br /&gt;
&lt;br /&gt;
Copy over and update the source files:&lt;br /&gt;
 Name&lt;br /&gt;
 ----&lt;br /&gt;
 NormalizeFixedWidth.pl&lt;br /&gt;
 RoundOnOneLine.pl&lt;br /&gt;
 USFirmBranchOffices1980.rpt&lt;br /&gt;
 USFirmBranchOffices1980.ssh&lt;br /&gt;
 USFirms1980.rpt&lt;br /&gt;
 USFirms1980.ssh&lt;br /&gt;
 USFund1980.rpt&lt;br /&gt;
 USFund1980.ssh&lt;br /&gt;
 USFundExecs1980.rpt&lt;br /&gt;
 USFundExecs1980.ssh&lt;br /&gt;
 USIPO1980.rpt&lt;br /&gt;
 USIPO1980.ssh&lt;br /&gt;
 USMAPrivate.rpt&lt;br /&gt;
 USMAPrivate00-10.ssh&lt;br /&gt;
 USMAPrivate10-.ssh&lt;br /&gt;
 USMAPrivate80-85.ssh&lt;br /&gt;
 USMAPrivate85-00.ssh&lt;br /&gt;
 USMAPublic.rpt&lt;br /&gt;
 USMAPublic00-.ssh&lt;br /&gt;
 USMAPublic80-00.ssh&lt;br /&gt;
 USPortCo1980.rpt&lt;br /&gt;
 USPortCo1980.ssh&lt;br /&gt;
 USPortCoExecs1980.rpt&lt;br /&gt;
 USPortCoExecs1980.ssh&lt;br /&gt;
 USPortCoLongDesc1980.rpt&lt;br /&gt;
 USPortCoLongDesc1980.ssh&lt;br /&gt;
 USRound1980.rpt&lt;br /&gt;
 USRound1980.ssh&lt;br /&gt;
 USRoundOnOneLine1980.rpt&lt;br /&gt;
 USRoundOnOneLine1980.ssh&lt;br /&gt;
&lt;br /&gt;
'''Update the paths and dates in the ssh files''' then run them (see [[SDC Platinum]]).&lt;br /&gt;
&lt;br /&gt;
===Database import===&lt;br /&gt;
&lt;br /&gt;
Run the [[SDC Normalizer]] on each of the files. For most of them, that's straightforward. You can safely ignore the Access Violation error messages that occur at the end of some pulls. However, the following require attention:&lt;br /&gt;
*Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
*Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt&lt;br /&gt;
*The private and public M&amp;amp;A file sets have to be (separately) combined after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t.&lt;br /&gt;
*For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
*The PortCo Long Description needs to be pre-processed from the command line and then post-processed in excel (see [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
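The M&amp;amp;A step above (combine the normalized files, then blank out the np and nm placeholders) can be sketched in Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, and it assumes every normalized file repeats the same header row, which is not confirmed here.&lt;br /&gt;
&lt;br /&gt;
```python
import re


def blank_placeholders(line):
    """Replace np/nm placeholder fields with empty fields in one tab-delimited line."""
    # The lookahead leaves the closing tab in place, so two adjacent
    # placeholder fields are both blanked in a single pass.
    return re.sub(r"\t(?:np|nm)(?=\t)", "\t", line)


def combine_normalized(texts):
    """Concatenate normalized files, keeping only the first file's header row."""
    combined = []
    for index, text in enumerate(texts):
        rows = text.rstrip("\n").split("\n")
        if index > 0:
            rows = rows[1:]  # assumption: every file repeats the same header
        combined.extend(blank_placeholders(row) for row in rows)
    return "\n".join(combined) + "\n"
```

A plain search and replace of \tnp\t with \t\t skips the second of two adjacent placeholders, because the shared tab is consumed by the first match. The lookahead avoids that, or the plain replace can simply be run twice.&lt;br /&gt;
&lt;br /&gt;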
Create the dbase as a researcher:&lt;br /&gt;
 createdb vcdb20&lt;br /&gt;
&lt;br /&gt;
Move the files to //mother/bulk/vcdb20 and run the load script:&lt;br /&gt;
 E:\projects\vcdb20\Load.sql&lt;br /&gt;
&lt;br /&gt;
===Create the keys===&lt;br /&gt;
&lt;br /&gt;
Standardize company names using Hall normalization without a fuzzy algorithm (see [[The Matcher (Tool)]]) by matching them to themselves for PortCos, M&amp;amp;As, and IPOs. '''It is crucial that the self-matches are made using mode=2, or you won't select a stdname and will generate duplicate entries (from the stdname permutations).'''&lt;br /&gt;
 perl .\Matcher.pl -mode=2 -file1=&amp;quot;DistinctConame.txt&amp;quot; -file2=&amp;quot;DistinctConame.txt&amp;quot;&lt;br /&gt;
&lt;br /&gt;
The keys for each table are as follows:&lt;br /&gt;
*PortCo: Coname, statecode, datefirstinv&lt;br /&gt;
*M&amp;amp;A (private targets): targetname, statecode, announceddate&lt;br /&gt;
*IPOs: issuer, statecode, issuedate&lt;br /&gt;
*Fund: Fundname&lt;br /&gt;
*Firm: Firmname&lt;br /&gt;
&lt;br /&gt;
Note that when normalizing Fundexecs use fundname, fundyear as the foreign keys. Generally, do not copy down keys unless they are foreign and blank in the source file. &lt;br /&gt;
&lt;br /&gt;
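Before relying on these keys, it can help to confirm that they are actually unique. The following is a minimal sketch, assuming the table has been read into a list of dicts keyed by column name; the function name and the sample rows are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return the key tuples that occur more than once in rows (a list of dicts)."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical PortCo rows; the first two share the same composite key.
portcos = [
    {"coname": "Acme", "statecode": "TX", "datefirstinv": "2001-01-01"},
    {"coname": "Acme", "statecode": "TX", "datefirstinv": "2001-01-01"},
    {"coname": "Acme", "statecode": "CA", "datefirstinv": "2001-01-01"},
]
print(duplicate_keys(portcos, ["coname", "statecode", "datefirstinv"]))
```

An empty result means the chosen columns identify each row uniquely. Key values are assumed to be non-null strings, since mixed types would not sort.&lt;br /&gt;
&lt;br /&gt;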
Also, clean up the round table and create the SEL flags, as well as the PortCoSEL table. And make sure that the end of period date (plus 1) is updated in PortCoAliveDead.&lt;br /&gt;
&lt;br /&gt;
===Add the geocoding===&lt;br /&gt;
&lt;br /&gt;
Load in the old geocoding and create data files for new Google Maps API runs (see [[Geocode.py]] for directions). Note that I separate out Growth and non-growth PortCo addresses so that I can get the growth ones first for the PortCos, and that there are also firm and firm branch office addresses to process (US only). The limit for free is 2,500 calls/day but the cost per call is pretty low.&lt;br /&gt;
&lt;br /&gt;
Note that PortCoGeoID and other tables are built in the BuildBaseTables.sql script.&lt;br /&gt;
&lt;br /&gt;
Change to Load script:&lt;br /&gt;
*Retrieve and load ZCTA Gazetteer from the [https://www.census.gov/geographies/reference-files/time-series/geo/gazetteer-files.html U.S. Census]&lt;br /&gt;
*A small but meaningful proportion of US venture firms have zip codes but not addresses. I created the table firmbogeoplus to add longitude and latitudes for these firms in Load.sql (right after firmbogeo).&lt;br /&gt;
&lt;br /&gt;
A small number of US PortCos have zipcodes but not places (which are used in the Rankings and elsewhere). To address this, I loaded the 2010 ZCTA to place lookup in Load.sql (passing the result through TigerGeog) to get the placename. This is now incorporated into PortCoGeoid in BuildBaseTables.sql.&lt;br /&gt;
&lt;br /&gt;
2010 Census data (2020 isn't available but the mappings are fairly static):&lt;br /&gt;
*https://www.census.gov/geographies/reference-files/time-series/geo/relationship-files.2010.html&lt;br /&gt;
*https://www2.census.gov/geo/pdfs/maps-data/data/rel/explanation_zcta_place_rel_10.pdf&lt;br /&gt;
*https://www2.census.gov/geo/docs/maps-data/data/rel/zcta_place_rel_10.txt&lt;br /&gt;
&lt;br /&gt;
I also load up the 2010 ANSI codes for places (i.e., statefp, statecode, placename, placefp, etc.): &lt;br /&gt;
* Source: https://www.census.gov/library/reference/code-lists/ansi.html&lt;br /&gt;
* Data: national_places.txt -&amp;gt; national_places_processed.txt (regexes)&lt;br /&gt;
* Pop the last word off the placename!&lt;br /&gt;
&lt;br /&gt;
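Popping the last word off the placename can be sketched as follows. This is a minimal sketch, assuming the trailing word is a single descriptor such as "city" or "town"; the function name is hypothetical, and multi-word descriptors would need extra handling.&lt;br /&gt;
&lt;br /&gt;
```python
def pop_last_word(placename):
    """Drop the trailing descriptor word, e.g. 'city' or 'town', from a place name."""
    cleaned = placename.strip()
    head, sep, _tail = cleaned.rpartition(" ")
    # A one-word name has nothing to pop, so it is returned unchanged.
    return head if sep else cleaned
```
&lt;br /&gt;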
Note that the postprocessing of these two tables is done at the end of Load.sql. The final table is '''zctaplaceinfo''' which takes zcta, statecode and provides back placename and geoid (of the place).&lt;br /&gt;
&lt;br /&gt;
===Load the Additions===&lt;br /&gt;
&lt;br /&gt;
These include:&lt;br /&gt;
*CPI (see the spreadsheet CPIEstimate.xlsx, source data from https://data.bls.gov/pdq/SurveyOutputServlet)&lt;br /&gt;
*Population, statepop2017, and ACS. See [[American Community Survey (ACS) Data]].&lt;br /&gt;
*tigerplaces, which builds tigergeog, and placedisplay. See [[Jeemin Sim (Work Log)]], [[Urban Start-up Agglomeration and Venture Capital Investment]] and [[Tiger Geocoder]]&lt;br /&gt;
*Industry, Firm type, Firm stage, Title, statecode, and other lookup tables. &lt;br /&gt;
*PortCoSBIR and PortCoPatent -- These are now out of date and don't appear to have build notes.&lt;br /&gt;
&lt;br /&gt;
===Join in the exit data===&lt;br /&gt;
&lt;br /&gt;
Bring in and clean up the IPO and MA (private target) data, match it to the PortCos, and reconcile the conflicts. This creates two core tables, one key table, and two PortCo results tables:&lt;br /&gt;
*IpoCleanNoDups&lt;br /&gt;
*MACleanNoDups&lt;br /&gt;
*ExitKeys (IPO and MA only, not PortCo)&lt;br /&gt;
*PortCoExit&lt;br /&gt;
*PortCoAliveDead&lt;br /&gt;
&lt;br /&gt;
'''PortCoAliveDead''' follows the academic convention of marking a startup as alive if it has begun receiving investment and hasn't exited and if its last investment occurred less than five years ago.&lt;br /&gt;
&lt;br /&gt;
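The alive/dead convention can be expressed as a small predicate. This is a minimal sketch in Python rather than the SQL used in the build; the function and parameter names are hypothetical, and treating "five years" as five calendar years before the as-of date is an assumption.&lt;br /&gt;
&lt;br /&gt;
```python
from datetime import date


def years_before(day, years):
    """Return the same calendar day the given number of years earlier."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; fall back to the 28th.
        return day.replace(year=day.year - years, day=28)


def is_alive(first_inv, last_inv, exit_date, as_of):
    """Apply the alive convention as of a given date."""
    if first_inv is None or first_inv > as_of:
        return False  # has not begun receiving investment
    if exit_date is not None and as_of >= exit_date:
        return False  # already exited
    return last_inv > years_before(as_of, 5)  # last round under five years ago
```

For example, a startup with no exit whose last round was on 2018-06-01 counts as alive as of 2021-01-01, while one whose last round was on 2015-06-01 does not.&lt;br /&gt;
&lt;br /&gt;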
==Base Tables==&lt;br /&gt;
&lt;br /&gt;
The base tables are built using BuildBaseTables.sql. These provide common foundations for:&lt;br /&gt;
*PortCo Geography&lt;br /&gt;
*Other PortCo tables, including industry, id, cpi adjustments, geoids.&lt;br /&gt;
**Exit, Alive/dead and patents &amp;amp; SBIR/STTR grant (note that these need updating) information is also included.&lt;br /&gt;
*Firm variables (note that the full build is only done for firms that make growth investments).&lt;br /&gt;
*Round Line Joiner (RLJoiner) tables.&lt;br /&gt;
*PortCo People, including gender, dr., titles, serials.&lt;br /&gt;
*Fund People: gender and dr.&lt;br /&gt;
*The Master tables:&lt;br /&gt;
**PortCoMaster&lt;br /&gt;
**PortCoPeopleMaster&lt;br /&gt;
**FirmGrowthMaster (the Firm master table, for growth investments)&lt;br /&gt;
**RLMaster&lt;br /&gt;
&lt;br /&gt;
Finally, this code also builds: MatchMostNumerous and MatchHighestRandom, which are used in [[Estimating Unobserved Complementarities between Entrepreneurs and Venture Capitalists]]&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48632</id>
		<title>VCDB24</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VCDB24&amp;diff=48632"/>
		<updated>2024-05-17T22:09:20Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[VCDB24]] is the 2024 and final iteration of my [[VentureXpert]] based '''V'''enture '''C'''apital '''D'''ata'''B'''ase. Thomson-Reuters discontinued access to VentureXpert through [[SDC Platinum]] on December 31st, 2023 (see also: [[SDC Normalizer]]). This iteration contains data up until then. Each VCDB includes investments, funds, startups, executives, exits, locations, and more. The previous build was [[VCDB23]], but the best previous instructions are from [[VCDB20]].&lt;br /&gt;
&lt;br /&gt;
== Processing Steps ==&lt;br /&gt;
&lt;br /&gt;
Get the source data:&lt;br /&gt;
# Copy over the rpt, ssh, and pl files to E:\projects\vcdb24\SDC, and bulk edit the ssh files. &lt;br /&gt;
## Make final date 12/31/2023 and change vcdb23 to vcdb24&lt;br /&gt;
# Run the ssh files against SDC Platinum one last time on 31 December 2023.&lt;br /&gt;
# Run the [[SDC Normalizer]] script (one of the pl files) on each output&lt;br /&gt;
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)&lt;br /&gt;
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, USFirmBranchOffices1980.txt&lt;br /&gt;
## The private and public M&amp;amp;A file sets have to be separately combined into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.&lt;br /&gt;
## For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.&lt;br /&gt;
## PortCoLongDescription must be pre-processed from the command line and then post-processed in excel (see below as well as [[VCDB20H1]] and [[Vcdb4#Long_Description]]).&lt;br /&gt;
&lt;br /&gt;
Create the postgres database:&lt;br /&gt;
# Create a new database on mother (createdb vcdb24) and set up a directory for the input files:  bulk\vcdb24&lt;br /&gt;
# Copy over (to sql folder) and edit Load.sql. Run it section-by-section.&lt;br /&gt;
&lt;br /&gt;
===PortCoLongDescription===&lt;br /&gt;
&lt;br /&gt;
Process the Long Description data as follows:&lt;br /&gt;
#Remove the header and footer, and then save as Process.txt using UNIX line endings and UTF-8 encoding.&lt;br /&gt;
#Run the first section (producing Out5.txt) of the regex process below&lt;br /&gt;
#Import into Excel to make tab-delimited&lt;br /&gt;
#Remove double quotes &amp;quot; from just the description field &lt;br /&gt;
#Put in a new header&lt;br /&gt;
#Save as In5.txt with UNIX/UTF-8&lt;br /&gt;
#Run the last regex. It deals with the spaces in the description and the cases when there is no description.&lt;br /&gt;
#Try importing USPortCoLongDesc1980Cleaned.txt. It should be fine.&lt;br /&gt;
&lt;br /&gt;
 cat Process.txt | perl -pe 's/^([^ ])/###\1/g' &amp;gt; Out1.txt&lt;br /&gt;
 cat Out1.txt | perl -pe 's/\s{65,}/ /g' &amp;gt; Out2.txt&lt;br /&gt;
 cat Out2.txt | perl -pe 's/\n//g' &amp;gt; Out3.txt&lt;br /&gt;
 cat Out3.txt | perl -pe 's/###/\n/g' &amp;gt; Out4.txt&lt;br /&gt;
 cat Out4.txt | perl -pe 's/(\d{4}) $/\1\t/g' &amp;gt; Out5.txt&lt;br /&gt;
 ...&lt;br /&gt;
 cat In5.txt | perl -pe 's/(\d{4})\t$/\1###/g' &amp;gt; Out6.txt&lt;br /&gt;
 cat Out6.txt | perl -pe 's/\s{2,}/ /g' &amp;gt; Out7.txt&lt;br /&gt;
 cat Out7.txt | perl -pe 's/###/\t/g' &amp;gt; USPortCoLongDesc1980Cleaned.txt&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
	<entry>
		<id>http://www.edegan.com/mediawiki/index.php?title=VentureXpert_Data&amp;diff=48631</id>
		<title>VentureXpert Data</title>
		<link rel="alternate" type="text/html" href="http://www.edegan.com/mediawiki/index.php?title=VentureXpert_Data&amp;diff=48631"/>
		<updated>2024-03-24T16:36:41Z</updated>

		<summary type="html">&lt;p&gt;Ed: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|Has project output=Data,How-to&lt;br /&gt;
|Has sponsor=McNair Center&lt;br /&gt;
|Has title=VentureXpert Data&lt;br /&gt;
|Has owner=Augi Liebster,&lt;br /&gt;
|Has start date=June 20, 2018&lt;br /&gt;
|Has project status=Active&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
The successors to this project include:&lt;br /&gt;
*[[VCDB24]], which is the most recent iteration.&lt;br /&gt;
*[[VCDB23]]&lt;br /&gt;
*[[VCDB20Q3]]&lt;br /&gt;
*[[VCDB20H1]]&lt;br /&gt;
*[[VCDB4]]&lt;br /&gt;
&lt;br /&gt;
==Relevant Former Projects==&lt;br /&gt;
#[[Venture Capital (Data)]]&lt;br /&gt;
#[[Retrieving US VC Data From SDC]]&lt;br /&gt;
#[[VC Database Rebuild]]&lt;br /&gt;
&lt;br /&gt;
==Location==&lt;br /&gt;
My scripts for SDC pulls are located in the E drive in the location:&lt;br /&gt;
 E:\McNair\Projects\VentureXpertDatabase\ScriptsForSDCExtract&lt;br /&gt;
&lt;br /&gt;
My successfully pulled and normalized files are stored in the location:&lt;br /&gt;
 E:\McNair\Projects\VentureXpertDatabase\ExtractedDataQ2&lt;br /&gt;
&lt;br /&gt;
My scripts for loading tables and data are in:&lt;br /&gt;
 E:\McNair\Projects\VentureXpertDatabase\vcdb3\LoadingScripts&lt;br /&gt;
&lt;br /&gt;
There are a variety of SQL files in there with self-explanatory names. The file that has all of the loading scripts is called LoadingScriptsV1. The folder vcdb2 is there for reference, to show what earlier builds did. ExtractedData is there because I pulled data before July 1st, and Ed asked me to repull the data.&lt;br /&gt;
&lt;br /&gt;
==Goal==&lt;br /&gt;
I will be looking to redesign the VentureXpert Database in a way that is more intuitively built than the previous one. I will also update the database with current data.&lt;br /&gt;
&lt;br /&gt;
==Initial Stages==&lt;br /&gt;
The first step of the project was to figure out what primary keys to use for each major table that I create. I looked at the primary keys used in the creation of the [[VC Database Rebuild]] and found primary keys that are decent. I have updated them and list them below:&lt;br /&gt;
&lt;br /&gt;
#CompanyBaseCore- coname, statecode, datefirstinv&lt;br /&gt;
#IPOCore- issuer, issuedate, statecode&lt;br /&gt;
#MACore- target name, target state code, announceddate&lt;br /&gt;
#Geo - city, statecode, coname, datefirst, year&lt;br /&gt;
#DeadDate - coname, statecode, datefirst, rounddate (tentative could still change)&lt;br /&gt;
#RoundCore- coname, statecode, datefirst, rounddate&lt;br /&gt;
#FirmBaseCore - firmname&lt;br /&gt;
#FundBaseCore - fund name (firstinvdate doesn't work because not every row has an entry)&lt;br /&gt;
&lt;br /&gt;
These are my initial listings and I will come back to update them if needed. &lt;br /&gt;
&lt;br /&gt;
The second part of the initial stage has been to pull data from the SDC Platinum platform. I did it in July to ensure that I had two full quarters of data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==SDC Pull==&lt;br /&gt;
When pulling data from SDC, it is a good idea to look for previously made rpt files that have the names of the pulls you will need to do. They have already been created and will save you a lot of work. The rpt files that I used are in the folder VentureXpertDB/ScriptsForSDCExtract. The files come in pairs, with one saved as an ssh file and one as an rpt file. To bring the dates up to date, open the ssh file of the pair and change the date of last investment. When you open SDC, you will be given a variety of choices for which database to pull from. For each type of file, choose the following:&lt;br /&gt;
&lt;br /&gt;
#VentureXpert - PortCo, PortCoLong, USVC, Firms, BranchOffices, Funds, Rounds, VCFirmLong&lt;br /&gt;
#Mergers &amp;amp; Acquisitions - MAs&lt;br /&gt;
#Global New Issues Databases - IPOs&lt;br /&gt;
&lt;br /&gt;
Help on pulling data from SDC is on the [[SDC Platinum (Wiki)]] page. &lt;br /&gt;
&lt;br /&gt;
===VCFund Pull Problem===&lt;br /&gt;
When pulling VCFund1980-Present, I encountered two problems. First, SDC's built-in filters cannot restrict the funds to US-only ones. Second, there are multiple rpt files that specify different variables for the fund pull. I pulled from both to be safe, and both are saved in the ExtractedData folder; the [[VC Database Rebuild]] page also has a section on the fund pull where Ed specifies which rpt file he used. After I spoke with Ed, he told me to use the VCFund1980-present.rpt file to extract the data. I also had various problems extracting the data, including the SDC program freezing and Out of Memory errors. Check the [[SDC Platinum (Wiki)]] page to fix these issues.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Loading Tables==&lt;br /&gt;
When I describe errors I encountered, I will not describe them using line numbers, because as soon as any data is added the line numbers become useless. Instead, I recommend that you copy the normalized file you are working with into an Excel file and use the filter feature. This way you can find the line in your specific file that is causing errors and fix it in the file itself. The line numbers that PuTTY errors display are often wrong, so I relied on Excel to find errors fastest. If my instructions are not enough for you to find the error, my advice is to pick key words from the line that PuTTY says is causing errors and filter for them in Excel.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE roundbase;&lt;br /&gt;
 CREATE TABLE roundbase (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   rounddate date,&lt;br /&gt;
   updateddate date,&lt;br /&gt;
   foundingdate date,&lt;br /&gt;
   datelastinv date,&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   investedk real,&lt;br /&gt;
   city varchar(100),&lt;br /&gt;
   description varchar(5000),&lt;br /&gt;
   msa varchar(100),&lt;br /&gt;
   msacode varchar(10),&lt;br /&gt;
   nationcode varchar(10),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   addr1 varchar(100),&lt;br /&gt;
   addr2 varchar(100),&lt;br /&gt;
   indclass varchar(100),&lt;br /&gt;
   indsubgroup3 varchar(100),&lt;br /&gt;
   indminor varchar(100),&lt;br /&gt;
   url varchar(5000),&lt;br /&gt;
   zip varchar(10),&lt;br /&gt;
   stage1 varchar(100),&lt;br /&gt;
   stage3 varchar(100),&lt;br /&gt;
   rndamtdisck real,&lt;br /&gt;
   rndamtestk real,&lt;br /&gt;
   roundnum integer,&lt;br /&gt;
   numinvestors integer&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY roundbase FROM 'USVC1980-2018q2-Good.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --151549&lt;br /&gt;
&lt;br /&gt;
The only error I encountered here was with Cardtronic Technology Inc., where a mixture of quotation marks causes errors in loading. Find this line using the Excel trick and remove the quotation marks manually.&lt;br /&gt;
&lt;br /&gt;
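As an alternative to filtering in Excel, a short script can list the rows most likely to break a load. This is a minimal sketch, not part of the original workflow: the function name is hypothetical, and it assumes the first row is a header that sets the expected number of tab-separated fields.&lt;br /&gt;
&lt;br /&gt;
```python
def find_suspect_lines(text, delimiter="\t"):
    """List (line_number, reason) pairs for rows likely to break a tab-delimited load."""
    rows = text.rstrip("\n").split("\n")
    expected = len(rows[0].split(delimiter))  # the header row sets the field count
    suspects = []
    for number, row in enumerate(rows, start=1):
        if '"' in row:
            suspects.append((number, "contains a double quote"))
        elif len(row.split(delimiter)) != expected:
            suspects.append((number, "unexpected field count"))
    return suspects
```

The line numbers are 1-based and count the header row, so they match what a text editor shows for the same file.&lt;br /&gt;
&lt;br /&gt;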
 DROP TABLE ipos;&lt;br /&gt;
 CREATE TABLE ipos (&lt;br /&gt;
   issuedate date,&lt;br /&gt;
   issuer varchar(255),&lt;br /&gt;
   statecode varchar(10), &lt;br /&gt;
   principalamt money, --million&lt;br /&gt;
   proceedsamt money, --sum of all markets in million&lt;br /&gt;
   naiccode varchar(255), --primary NAIC code&lt;br /&gt;
   zipcode varchar(10),&lt;br /&gt;
   status varchar (20),&lt;br /&gt;
   foundeddate date&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY ipos FROM 'IPO1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --12107&lt;br /&gt;
&lt;br /&gt;
I encountered no errors while loading this data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE branchoffices;&lt;br /&gt;
 CREATE TABLE branchoffices (&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   bocity varchar(100),&lt;br /&gt;
   bostate varchar(2),&lt;br /&gt;
   bocountrycode varchar(2),&lt;br /&gt;
   bonation varchar(100),&lt;br /&gt;
   bozip varchar(10),&lt;br /&gt;
   boaddr1 varchar(100),&lt;br /&gt;
   boaddr2 varchar(100)&lt;br /&gt;
 &lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY branchoffices FROM 'USVCFirmBranchOffices1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --10353&lt;br /&gt;
&lt;br /&gt;
I encountered no errors while loading this data.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE roundline;&lt;br /&gt;
 CREATE TABLE roundline (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(2),&lt;br /&gt;
   datelastinv date,&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   rounddate date,&lt;br /&gt;
   disclosedamt real,&lt;br /&gt;
   fundname varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY roundline FROM 'USVCRound1980-2018q2-NoFoot-normal-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --403189&lt;br /&gt;
&lt;br /&gt;
I encountered no errors while loading this data.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundbase;&lt;br /&gt;
 CREATE TABLE fundbase (&lt;br /&gt;
   fundname varchar(255),&lt;br /&gt;
   closedate date, --mm-dd-yyyy&lt;br /&gt;
   lastinvdate date, --mm-dd-yyyy&lt;br /&gt;
   firstinvdate date, --mm-dd-yyyy&lt;br /&gt;
   numportcos integer,&lt;br /&gt;
   investedk real,&lt;br /&gt;
   city varchar(100),&lt;br /&gt;
   fundyear varchar(4), --yyyy&lt;br /&gt;
   zip varchar(10),&lt;br /&gt;
   statecode varchar(2),&lt;br /&gt;
   fundsizem real,&lt;br /&gt;
   fundstage varchar(100),&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   dateinfoupdate date,&lt;br /&gt;
   invtype varchar(100),&lt;br /&gt;
   msacode varchar(10),&lt;br /&gt;
   nationcode varchar(10),&lt;br /&gt;
   raisestatus varchar(100),&lt;br /&gt;
   seqnum integer,&lt;br /&gt;
   targetsizefund real,&lt;br /&gt;
   fundtype varchar(100)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY fundbase FROM 'VCFund1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --29397&lt;br /&gt;
&lt;br /&gt;
There is a Ukrainian fund that has stray quotation marks in its name. It is called something along the lines of &amp;quot;VAT &amp;quot;ZNVKIF &amp;quot;Skhidno-Evropeis'lyi investytsiynyi Fond&amp;quot;. If searching for the name does not help, you can filter in Excel using Kiev as the keyword in the city column and find the line where you are getting errors. Then manually remove the quotation marks in the actual text file. After that, the table should load correctly.&lt;br /&gt;
 DROP TABLE firmbase;&lt;br /&gt;
 CREATE TABLE firmbase(&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   foundingdate date, --mm-dd-yyyy&lt;br /&gt;
   datefirstinv date, --mm-dd-yyyy  &lt;br /&gt;
   datelastinv date, --mm-dd-yyyy&lt;br /&gt;
   addr1 varchar(100),&lt;br /&gt;
   addr2 varchar(100),&lt;br /&gt;
   location varchar(100),&lt;br /&gt;
   city varchar(100),&lt;br /&gt;
   zip varchar(10),&lt;br /&gt;
   areacode integer,&lt;br /&gt;
   county varchar(100),&lt;br /&gt;
   state varchar(2),&lt;br /&gt;
   nationcode varchar(10),&lt;br /&gt;
   nation varchar(100),&lt;br /&gt;
   worldregion varchar(100),&lt;br /&gt;
   numportcos integer,&lt;br /&gt;
   numrounds integer,&lt;br /&gt;
   investedk money,&lt;br /&gt;
   capitalundermgmt money,  &lt;br /&gt;
   invstatus varchar(100),&lt;br /&gt;
   msacode varchar(10),&lt;br /&gt;
   rolepref varchar(100),&lt;br /&gt;
   geogpref varchar(100),&lt;br /&gt;
   indpref varchar(100),&lt;br /&gt;
   stagepref varchar(100),&lt;br /&gt;
   type varchar(100)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 \COPY firmbase FROM 'USVCFirms1980-2018q2-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --15899&lt;br /&gt;
&lt;br /&gt;
The normalization for this file was wrong when I first tried to load the data. To fix this, go to the file from which you removed the footer and find the column header titled Firm Capital under Mgmt{0Mil}. Delete the {0Mil} and renormalize the file. Then everything should be ok. A good way to check this is to copy and paste the normalized file into an Excel sheet and see whether the entries line up correctly with their column headers. &lt;br /&gt;
The second error I found was with the Kerala Ventures firm. Its address contains the text l&amp;quot;opera. This quotation mark causes errors, so find the line number using Excel and remove it manually.&lt;br /&gt;
The third error is in an area code that is written as 1-8. The hyphen causes errors because areacode is an integer column. Interestingly, the line number given by PuTTY was correct, and I found it in my text file and deleted it manually.&lt;br /&gt;
These were the only errors I encountered while loading this table.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE mas;&lt;br /&gt;
 CREATE TABLE mas (&lt;br /&gt;
   announceddate date,&lt;br /&gt;
   effectivedate date,&lt;br /&gt;
   targetname varchar(255),&lt;br /&gt;
   targetstate varchar(100),&lt;br /&gt;
   acquirorname varchar(255),&lt;br /&gt;
   acquirorstate varchar(100),&lt;br /&gt;
   transactionamt money,&lt;br /&gt;
   enterpriseval varchar(255),&lt;br /&gt;
   acquirorstatus varchar(150)&lt;br /&gt;
 );&lt;br /&gt;
 \COPY mas FROM 'MAUSTargetComp100pc1985-July2018-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --119432&lt;br /&gt;
&lt;br /&gt;
I encountered no problems loading in this data.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE longdescription;&lt;br /&gt;
 CREATE TABLE longdescription(&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   fundingdate date, --date co received first inv&lt;br /&gt;
   codescription varchar(10000) --long description&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY longdescription FROM 'PortCoLongDesc-Ready-normal-fixed.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --48037&lt;br /&gt;
&lt;br /&gt;
I encountered no problems loading this data.&lt;br /&gt;
&lt;br /&gt;
==Cleaning Companybase, Fundbase, Firmbase, and BranchOffice==&lt;br /&gt;
===Cleaning Company===&lt;br /&gt;
The primary key for port cos will be coname, datefirstinv, and statecode. Before checking whether this is a valid primary key, remove the undisclosed companies. I will explain the second part of the query concerning New York Digital Health later. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE companybasecore;&lt;br /&gt;
 CREATE TABLE companybasecore AS&lt;br /&gt;
 SELECT * &lt;br /&gt;
 FROM Companybase WHERE nationcode = 'US' AND coname != 'Undisclosed Company' &lt;br /&gt;
 AND NOT (coname='New York Digital Health LLC' AND statecode='NY' AND datefirstinv='2015-08-13' AND updateddate='2015-10-20');&lt;br /&gt;
 --48001&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM (SELECT DISTINCT coname, statecode, datefirstinv FROM companybasecore) AS T;&lt;br /&gt;
 --48001&lt;br /&gt;
Since the count of the table and the count of the distinct primary keys are equal, you know that the primary key is valid. In the initial cleaning of the table, I first filtered out only the undisclosed companies. This table had 48002 rows. I then ran the DISTINCT query above and found that there were 48001 distinct rows with the coname, datefirstinv, statecode primary key. Thus there must be two rows that share a primary key. I found this key using the following query:&lt;br /&gt;
&lt;br /&gt;
 SELECT * FROM (SELECT coname, datefirstinv, statecode FROM companybase) as key GROUP BY coname, datefirstinv, statecode HAVING COUNT(key) &amp;gt; 1;&lt;br /&gt;
&lt;br /&gt;
The company named 'New York Digital Health LLC' came up as the company that is causing the problems. I queried to find the two rows that list this company name in companybase and chose to keep the row that had the earlier updated date. It is a good practice to avoid deleting rows from tables when possible, so I added the filter as a WHERE clause to exclude one of the New York Digital listings.&lt;br /&gt;
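&lt;br /&gt;
The lookup of the two conflicting rows is not shown above. A query along these lines would display them together for comparison. It is a sketch that uses only the coname and updateddate columns already referenced in the WHERE clause earlier in this section:&lt;br /&gt;
&lt;br /&gt;
 SELECT * FROM companybase WHERE coname='New York Digital Health LLC' ORDER BY updateddate;&lt;br /&gt;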
&lt;br /&gt;
===Cleaning Fundbase===&lt;br /&gt;
The primary key for funds will be only the fundname. First get rid of all of the undisclosed funds. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundbasenound;&lt;br /&gt;
 CREATE TABLE fundbasenound AS &lt;br /&gt;
 SELECT DISTINCT * FROM fundbase WHERE fundname NOT LIKE '%Undisclosed Fund%';&lt;br /&gt;
 --28886&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM (SELECT DISTINCT fundname FROM fundbasenound)a;&lt;br /&gt;
 --28833&lt;br /&gt;
&lt;br /&gt;
As you can see, fundbase still has rows that share fundnames. If you are wondering why the DISTINCT in the first query did not eliminate these, it is because this DISTINCT applies to the whole row not individual fundnames. Thus, only completely duplicate rows will be eliminated in the first query. I chose to keep the funds that have the earlier last investment date. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundups;&lt;br /&gt;
 CREATE TABLE fundups AS SELECT&lt;br /&gt;
 fundname, max(lastinvdate) AS lastinvdate FROM fundbasenound GROUP BY fundname HAVING COUNT(*)&amp;gt;1;&lt;br /&gt;
 --53&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundbasecore;&lt;br /&gt;
 CREATE TABLE fundbasecore AS&lt;br /&gt;
 SELECT A.* FROM fundbasenound AS A LEFT JOIN fundups AS B ON A.fundname=B.fundname AND A.lastinvdate=B.lastinvdate WHERE B.fundname IS NULL AND B.lastinvdate IS NULL;&lt;br /&gt;
 --28833&lt;br /&gt;
&lt;br /&gt;
Since the count of fundbasecore is the same as the number of distinct fund names, we know that the fundbasecore table is clean. The first query finds the duplicated fund names and, for each one, records the later date of last investment. The second query matches this table back to fundbasenound and keeps only the rows of fundbasenound for which there is no corresponding row in fundups based on fund name and date of last investment. This allows the funds with the earlier date of last investment to be chosen.&lt;br /&gt;
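&lt;br /&gt;
The same one-row-per-fundname result can also be sketched with PostgreSQL's DISTINCT ON. This is an alternative, not the query used above, and it is untested against this data. It keeps the row with the earliest lastinvdate for each fundname, breaks ties arbitrarily, and sorts NULL dates last, so its output should be compared with fundbasecore before relying on it.&lt;br /&gt;
&lt;br /&gt;
 --Sketch only: keep one row per fundname, preferring the earliest lastinvdate&lt;br /&gt;
 SELECT DISTINCT ON (fundname) *&lt;br /&gt;
 FROM fundbasenound&lt;br /&gt;
 ORDER BY fundname, lastinvdate;&lt;br /&gt;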
&lt;br /&gt;
===Cleaning Firmbase===&lt;br /&gt;
The primary key for firms will be firm name. First I got rid of all undisclosed firms. I also filtered out two firms that have identical firm names and founding dates. This is because I use founding dates to filter out duplicate firm names: if two rows have the same firm name and founding date, they will not be filtered out by the third query below. Thus, I chose to filter those out completely.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmbasenound;&lt;br /&gt;
 CREATE TABLE firmbasenound AS &lt;br /&gt;
 SELECT DISTINCT * FROM firmbase WHERE firmname NOT LIKE '%Undisclosed Firm%' AND firmname NOT LIKE '%Amundi%' AND firmname NOT LIKE '%Schroder Adveq Management%';&lt;br /&gt;
 --15452&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT firmname FROM firmbasenound)a;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
Since these counts are not equal we will have to clean the table further. We will use the same method from before.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmdups;&lt;br /&gt;
 CREATE TABLE firmdups AS SELECT&lt;br /&gt;
 firmname, max(foundingdate) as foundingdate FROM firmbasenound GROUP BY firmname HAVING COUNT(*)&amp;gt;1;&lt;br /&gt;
 --15&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmbasecore;&lt;br /&gt;
 CREATE TABLE firmbasecore AS&lt;br /&gt;
 SELECT A.* FROM firmbasenound AS A LEFT JOIN firmdups AS B ON A.firmname=B.firmname AND A.foundingdate=B.foundingdate WHERE B.firmname IS NULL AND B.foundingdate IS NULL;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
Since the count of firmbasecore and the DISTINCT query are the same, the firm table is now clean.&lt;br /&gt;
&lt;br /&gt;
===Cleaning Branch Offices===&lt;br /&gt;
When cleaning the branch offices, I had to remove all duplicates in the table. This is because the table is so sparse that often the only data in a row would be the name of the firm the branch was associated with. Thus, I couldn't filter based on dates as I had been doing previously for firms and funds. The primary key is firm name.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bonound;&lt;br /&gt;
 CREATE TABLE bonound AS&lt;br /&gt;
 SELECT *, CASE WHEN firmname LIKE '%Undisclosed Firm%' THEN 1::int ELSE 0::int END AS undisclosedflag&lt;br /&gt;
 FROM branchoffices;&lt;br /&gt;
 --10353&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT firmname FROM bonound)a;&lt;br /&gt;
 --10042&lt;br /&gt;
&lt;br /&gt;
Since these counts aren't the same, we will have to work a little more to clean the table. As stated above, I did this by excluding the firm names that were duplicated.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE branchofficecore;&lt;br /&gt;
 CREATE TABLE branchofficecore AS&lt;br /&gt;
 SELECT A.* FROM bonound AS A JOIN (&lt;br /&gt;
 		SELECT bonound.firmname, COUNT(*) FROM bonound GROUP BY firmname&lt;br /&gt;
 		HAVING COUNT(*) =1&lt;br /&gt;
 		) AS B&lt;br /&gt;
 ON A.firmname=B.firmname WHERE undisclosedflag=0;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM (SELECT DISTINCT firmname FROM branchofficecore)a;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
Since these counts are the same, we are good to go. The count is 10 lower because we completely removed 10 firmnames from the listing by throwing out the duplicates.&lt;br /&gt;
&lt;br /&gt;
==Instructions on Matching PortCos to Issuers and M&amp;amp;As From Ed==&lt;br /&gt;
===Company Standardizing===&lt;br /&gt;
&lt;br /&gt;
Get portco keys&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE portcokey;&lt;br /&gt;
 CREATE TABLE portcokey AS&lt;br /&gt;
 SELECT coname, statecode, datefirstinv&lt;br /&gt;
 FROM portcocore;&lt;br /&gt;
 --CHECK COUNT IS SAME AS portcocore OR THESE KEYS ARE NOT VALID AND FIX THAT FIRST&lt;br /&gt;
&lt;br /&gt;
Get distinct coname and put it in a file&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT DISTINCT coname FROM portcokey) TO 'DistinctConame.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
&lt;br /&gt;
Match that to itself&lt;br /&gt;
 Move DistinctConame.txt to E:\McNair\Software\Scripts\Matcher\Input&lt;br /&gt;
 Open powershell and change directory to E:\McNair\Software\Scripts\Matcher&lt;br /&gt;
 Run the matcher in mode2:&lt;br /&gt;
  perl Matcher.pl -file1=&amp;quot;DistinctConame.txt&amp;quot; -file2=&amp;quot;DistinctConame.txt&amp;quot; -mode=2&lt;br /&gt;
 Pick up the output file from E:\McNair\Software\Scripts\Matcher\Output (it is probably called DistinctConame.txt-DistinctConame.txt.matched) and move it to your Z drive directory&lt;br /&gt;
 &lt;br /&gt;
Load the matches into the dbase&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE PortcoStd;&lt;br /&gt;
 CREATE TABLE PortcoStd (&lt;br /&gt;
    conamestd  varchar(255),&lt;br /&gt;
    coname   varchar(255),&lt;br /&gt;
    norm  varchar(100),&lt;br /&gt;
    x1  varchar(255),&lt;br /&gt;
    x2  varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
 &lt;br /&gt;
 \COPY PortcoStd FROM 'DistinctConame.txt-DistinctConame.txt.matched' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --YOUR COUNT&lt;br /&gt;
 &lt;br /&gt;
Join the conamestd back to the portcokey table to create your matching table&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE portcokeysstd;&lt;br /&gt;
 CREATE TABLE portcokeysstd AS&lt;br /&gt;
 SELECT B.conamestd, A.*&lt;br /&gt;
 FROM portcokey AS A&lt;br /&gt;
 JOIN PortcoStd AS B ON A.coname=B.coname;&lt;br /&gt;
 --CHECK COUNT IS SAME AS portcokey OR YOU LOST SOME NAMES OR INFLATED THE DATA&lt;br /&gt;
 &lt;br /&gt;
Put that in a file for matching (conamestd is in first column by construction)&lt;br /&gt;
&lt;br /&gt;
  \COPY portcokeysstd TO 'PortCoMatchInput.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
  --YOUR COUNT&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===MA Cleaning and Matching===&lt;br /&gt;
First remove all of the duplicates in the MA data. Do this by running aggregate queries on every column except for the primary key:&lt;br /&gt;
 DROP TABLE MANoDups;&lt;br /&gt;
 CREATE TABLE MANoDups AS&lt;br /&gt;
 SELECT targetname, targetstate, announceddate, min(effectivedate) AS effectivedate, MIN(acquirorname) as acquirorname, MIN(acquirorstate) as acquirorstate, MAX(transactionamt) as &lt;br /&gt;
 transactionamt, MAX(enterpriseval) as enterpriseval, MIN(acquirorstatus) as acquirorstatus&lt;br /&gt;
 FROM mas &lt;br /&gt;
 GROUP BY targetname, targetstate, announceddate ORDER BY targetname, targetstate, announceddate;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT targetname, targetstate, announceddate FROM manodups)a;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
Since these counts are equivalent, the data set is clean. Then get all the primary keys from the table and copy the distinct target names into a text file.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE makey;&lt;br /&gt;
 CREATE TABLE makey AS&lt;br /&gt;
 SELECT targetname, targetstate, announceddate&lt;br /&gt;
 FROM manodups;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT DISTINCT targetname FROM makey) TO 'DistinctTargetName.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV;&lt;br /&gt;
 --117212&lt;br /&gt;
&lt;br /&gt;
After running this list of distinct target names through the matcher, load the standardized MA list into the database.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MaStd;&lt;br /&gt;
 CREATE TABLE MaStd (&lt;br /&gt;
   targetnamestd varchar(255),&lt;br /&gt;
   targetname varchar(255),&lt;br /&gt;
   norm varchar(100),&lt;br /&gt;
   x1 varchar(255),&lt;br /&gt;
   x2 varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY mastd FROM 'DistinctTargetName.txt-DistinctTargetName.txt.matched' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --117212&lt;br /&gt;
&lt;br /&gt;
Then match the list of standardized names back to the makey table to get a table with standardized keys and primary keys. This will be your input for matching against port cos. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE makeysstd;&lt;br /&gt;
 CREATE TABLE makeysstd AS&lt;br /&gt;
 SELECT B.targetnamestd, A.*&lt;br /&gt;
 FROM makey AS A&lt;br /&gt;
 JOIN mastd AS B ON A.targetname=B.targetname;&lt;br /&gt;
 --119374&lt;br /&gt;
&lt;br /&gt;
  \COPY makeysstd TO 'MAMatchInput.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
  --119374&lt;br /&gt;
&lt;br /&gt;
Use this text file to match against the PortCoMatchInput. Your job will be to determine whether the matches between the MAs and PortCos are true matches. The techniques that I used are described in the section below.&lt;br /&gt;
&lt;br /&gt;
===IPO Cleaning and Matching===&lt;br /&gt;
The process is the same for IPOs.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE iponodups;&lt;br /&gt;
 CREATE TABLE iponodups&lt;br /&gt;
 AS SELECT issuer, statecode, issuedate, MAX(principalamt) AS principalamt, MAX(proceedsamt) AS proceedsamt, MIN(naiccode) as naicode, MIN(zipcode) AS zipcode, MIN(status) AS status, &lt;br /&gt;
 MIN(foundeddate) AS foundeddate&lt;br /&gt;
 FROM ipos GROUP BY issuer, statecode, issuedate ORDER BY issuer, statecode, issuedate; &lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT issuer, statecode, issuedate FROM iponodups)a;&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ipokeys;&lt;br /&gt;
 CREATE TABLE ipokeys AS&lt;br /&gt;
 SELECT issuer, statecode, issuedate&lt;br /&gt;
 FROM iponodups;&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT DISTINCT issuer FROM ipokeys) TO 'IPODistinctIssuer.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --10803&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ipokeysstd;&lt;br /&gt;
 CREATE TABLE ipokeysstd (&lt;br /&gt;
    issuerstd varchar(255),&lt;br /&gt;
    issuer varchar(255),&lt;br /&gt;
    norm varchar(100),&lt;br /&gt;
    x1 varchar(255),&lt;br /&gt;
    x2 varchar(255)&lt;br /&gt;
   );&lt;br /&gt;
 &lt;br /&gt;
 \COPY ipokeysstd FROM 'IPODistinctIssuer.txt-IPODistinctIssuer.txt.matched' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --10803&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ipostd;&lt;br /&gt;
 CREATE TABLE ipostd AS&lt;br /&gt;
 SELECT B.issuerstd, A.*&lt;br /&gt;
 FROM ipokeys AS A&lt;br /&gt;
 JOIN ipokeysstd AS B ON A.issuer=B.issuer;&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
 \COPY ipostd TO 'IPOMatchInput.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --11149&lt;br /&gt;
&lt;br /&gt;
As with MA, match this file against the PortCoMatchInput file without mode 2. Then manually check the matches using the techniques described below.&lt;br /&gt;
&lt;br /&gt;
I generally use MAX for amounts and MIN for dates. I also chose to use MIN on text strings.&lt;br /&gt;
&lt;br /&gt;
==Cleaning IPO and MA Data==&lt;br /&gt;
It is important to follow Ed's direction of cleaning the data using aggregate functions before putting the data into Excel. This will save you a lot of unnecessary manual checking. When ready, paste the data you have into an Excel file. In that file, I made three columns: one checking whether the state codes were equivalent, one checking whether the date of first investment was 3 years before the MA or IPO, and one checking whether both of these conditions were satisfied for each company. I did this using simple IF statements. The rest of the process is manual checking and filtering to decide whether matches are correct, and is thus extremely subjective and tedious. First, I went through and checked the companies that did not have equivalent state codes. If the company was one that I knew, or the name was unique to the point that I did not believe the same name would appear in another state, I marked the state codes as equivalent. I did the same for the date of first investment versus the MA/IPO date. Then I removed all duplicates that had the marking Warning Multiple Matches, and the data sheets were clean.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Process For Creating the PortCoExits Table==&lt;br /&gt;
===MA Process===&lt;br /&gt;
First we must load the clean, manually checked tables back into the database. &lt;br /&gt;
 DROP TABLE MAClean;&lt;br /&gt;
 CREATE TABLE MAClean (&lt;br /&gt;
  conamestd varchar(255),&lt;br /&gt;
  targetnamestd varchar(255),&lt;br /&gt;
  method varchar(100),&lt;br /&gt;
  x1 varchar(255),&lt;br /&gt;
  coname varchar(255),&lt;br /&gt;
  statecode varchar(10),&lt;br /&gt;
  datefirstinv date,&lt;br /&gt;
  x2 varchar(255),&lt;br /&gt;
  targetname varchar(255),&lt;br /&gt;
  targetstate varchar(10),&lt;br /&gt;
  announceddate date&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY MAClean FROM 'MAClean.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --7205&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT targetname, targetstate, announceddate FROM MAClean)a;&lt;br /&gt;
 --7188&lt;br /&gt;
&lt;br /&gt;
As you can see there are still duplicate primary keys in the table. To get rid of these I wrote a query that chooses primary keys that occur only once and matches them against MANoDups. That way you will have unique primary keys by construction.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MACleanNoDups;&lt;br /&gt;
 CREATE TABLE MACleanNoDups AS&lt;br /&gt;
 SELECT A.*, effectivedate, transactionamt, enterpriseval, acquirorstatus&lt;br /&gt;
 FROM MAClean AS A&lt;br /&gt;
 JOIN (&lt;br /&gt;
 	SELECT targetname, targetstate, announceddate, COUNT(*) FROM MAClean&lt;br /&gt;
 	GROUP BY targetname, targetstate, announceddate HAVING COUNT(*)=1&lt;br /&gt;
 	) AS B&lt;br /&gt;
 ON A.targetname=B.targetname AND A.targetstate=B.targetstate AND A.announceddate=B.announceddate&lt;br /&gt;
 LEFT JOIN MANoDups AS C ON A.targetname=C.targetname AND A.targetstate=C.targetstate AND A.announceddate=C.announceddate;&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT coname, statecode, datefirstinv FROM MACleanNoDups)a;&lt;br /&gt;
 --7171&lt;br /&gt;
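&lt;br /&gt;
A distinct count only demonstrates uniqueness when it equals the total row count of the table, which is not recorded here. A check along these lines (a sketch, not part of the original run) should return the same number as the DISTINCT query:&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM MACleanNoDups;&lt;br /&gt;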
&lt;br /&gt;
Thus the portco primary key is unique in the table. We will use this later. &lt;br /&gt;
Now do the same for the IPOs.&lt;br /&gt;
&lt;br /&gt;
===IPO Process===&lt;br /&gt;
 DROP TABLE IPOClean;&lt;br /&gt;
 CREATE TABLE IPOClean (&lt;br /&gt;
  conamestd varchar(255),&lt;br /&gt;
  issuernamestd varchar(255),&lt;br /&gt;
  method varchar(100),&lt;br /&gt;
  x1 varchar(255),&lt;br /&gt;
  coname varchar(255),&lt;br /&gt;
  statecode varchar(10),&lt;br /&gt;
  datefirstinv date,&lt;br /&gt;
  x2 varchar(255),&lt;br /&gt;
  issuername varchar(255),&lt;br /&gt;
  issuerstate varchar(10),&lt;br /&gt;
  issuedate date&lt;br /&gt;
 );&lt;br /&gt;
 &lt;br /&gt;
 \COPY IPOClean FROM 'IPOClean.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --2146&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT issuername, issuerstate, issuedate FROM IPOClean)a;&lt;br /&gt;
 --2141&lt;br /&gt;
&lt;br /&gt;
As with the MA process, there were duplicates in the clean IPO table. Get rid of these using the same process as with MAs. Only choose the primary keys that occur once and join these to the IPONoDups table.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE IPOCleanNoDups;&lt;br /&gt;
 CREATE TABLE IPOCleanNoDups AS&lt;br /&gt;
 SELECT A.*, principalamt, proceedsamt, naicode as naics, zipcode, status, foundeddate&lt;br /&gt;
 FROM IPOClean AS A&lt;br /&gt;
 JOIN (&lt;br /&gt;
 	SELECT issuername, issuerstate, issuedate, COUNT(*) FROM IPOClean&lt;br /&gt;
 	GROUP BY issuername, issuerstate, issuedate HAVING COUNT(*)=1&lt;br /&gt;
 	) AS B&lt;br /&gt;
 ON A.issuername=B.issuername AND A.issuerstate=B.issuerstate AND A.issuedate=B.issuedate&lt;br /&gt;
 LEFT JOIN IPONoDups AS C ON A.issuername=C.issuer AND A.issuerstate=C.statecode AND A.issuedate=C.issuedate;&lt;br /&gt;
 --2136&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(*) FROM(SELECT DISTINCT coname, statecode, datefirstinv FROM IPOCleanNoDups)a;&lt;br /&gt;
 --2136&lt;br /&gt;
&lt;br /&gt;
Now the duplicates are out of the MAClean and IPOClean data and we can start to construct the ExitKeysClean table.&lt;br /&gt;
&lt;br /&gt;
==Creating ExitKeysClean==&lt;br /&gt;
&lt;br /&gt;
First I looked for the PortCos that were in both the MAs and the IPOs. I did this using:&lt;br /&gt;
 DROP TABLE IPOMAForReview;&lt;br /&gt;
 CREATE TABLE IPOMAForReview AS&lt;br /&gt;
 SELECT A.*, B.targetname, B.targetstate, B.announceddate&lt;br /&gt;
 FROM IPOCleanNoDups AS A&lt;br /&gt;
 JOIN MACleanNoDups AS B ON A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv;&lt;br /&gt;
 --92&lt;br /&gt;
&lt;br /&gt;
I then pulled out the IPOs that were only IPOs and the MAs that were only MAs. I also added a column that indicates whether a company underwent an IPO or an MA.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE IPONoConflict;&lt;br /&gt;
 CREATE TABLE IPONoConflict AS&lt;br /&gt;
 SELECT A.*, 1::int as IPOvsMA&lt;br /&gt;
 FROM IPOCleanNoDups AS A &lt;br /&gt;
 LEFT JOIN MACleanNoDups AS B &lt;br /&gt;
 ON A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv &lt;br /&gt;
 WHERE B.statecode IS NULL AND B.coname IS NULL AND B.datefirstinv IS NULL;&lt;br /&gt;
 --2044&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MANoConflict;&lt;br /&gt;
 CREATE TABLE MANoConflict AS&lt;br /&gt;
 SELECT A.*, 0::int as IPOvsMA&lt;br /&gt;
 FROM MACleanNoDups AS A&lt;br /&gt;
 LEFT JOIN IPOCleanNoDups AS B &lt;br /&gt;
 ON A.coname=B.Coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv&lt;br /&gt;
 WHERE B.statecode IS NULL AND B.coname IS NULL AND B.datefirstinv IS NULL;&lt;br /&gt;
 --7079&lt;br /&gt;
&lt;br /&gt;
Since 2136-92=2044 and 7171-92=7079, we know that the duplicate companies were extracted successfully.&lt;br /&gt;
&lt;br /&gt;
I then wrote a query to check whether the IPO issue date or the announced date of the MA was earlier, and used that to decide whether the company is treated as having undergone an MA or an IPO in the column IPOvsMA. A 0 in the column represents an MA being chosen and a 1 represents an IPO being chosen.&lt;br /&gt;
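&lt;br /&gt;
The review query itself is not reproduced here. A minimal sketch, assuming the same date comparison that the MASelected and IPOSelected queries in this section apply (an IPO is chosen only when its issue date is strictly earlier than the MA announced date), might look like:&lt;br /&gt;
&lt;br /&gt;
 --Sketch only: 1 means the IPO is chosen, 0 means the MA is chosen&lt;br /&gt;
 SELECT coname, statecode, datefirstinv, issuedate, announceddate,&lt;br /&gt;
 CASE WHEN issuedate &amp;lt; announceddate THEN 1::int ELSE 0::int END AS ipovsma&lt;br /&gt;
 FROM IPOMAForReview;&lt;br /&gt;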
&lt;br /&gt;
&lt;br /&gt;
Then, out of this table, I extracted the MAs and IPOs using the date comparison behind the IPOvsMA flag:&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE MASelected;&lt;br /&gt;
 CREATE TABLE MASelected AS&lt;br /&gt;
 SELECT coname, statecode, datefirstinv, &lt;br /&gt;
 targetname, targetstate, announceddate,&lt;br /&gt;
 0::int as IPOvsMA&lt;br /&gt;
 FROM IPOMAForReview &lt;br /&gt;
 WHERE issuedate &amp;gt;= announceddate;&lt;br /&gt;
 --25&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE IPOSelected;&lt;br /&gt;
 CREATE TABLE IPOSelected AS&lt;br /&gt;
 SELECT coname, statecode, datefirstinv, &lt;br /&gt;
 issuername, issuerstate, issuedate,&lt;br /&gt;
 1::int as IPOvsMA&lt;br /&gt;
 FROM IPOMAForReview &lt;br /&gt;
 WHERE issuedate &amp;lt; announceddate;&lt;br /&gt;
 --67&lt;br /&gt;
&lt;br /&gt;
I then made the ExitKeysClean table (created as ExitKeys in the query below) using the portco primary key and the IPOvsMA indicator column.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE ExitKeys;&lt;br /&gt;
 CREATE TABLE ExitKeys AS&lt;br /&gt;
 SELECT coname, statecode, datefirstinv, ipovsma FROM IPONoConflict&lt;br /&gt;
 UNION SELECT coname, statecode, datefirstinv, ipovsma FROM IPOSelected&lt;br /&gt;
 UNION SELECT coname, statecode, datefirstinv, ipovsma FROM MANoConflict&lt;br /&gt;
 UNION SELECT coname, statecode, datefirstinv, ipovsma FROM MASelected;&lt;br /&gt;
 --9215&lt;br /&gt;
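&lt;br /&gt;
The count is consistent with the four inputs: 2044 + 67 + 7079 + 25 = 9215. Because UNION removes duplicate rows but not duplicate keys, a check like the following sketch (not part of the original run) can confirm that each portco key appears only once:&lt;br /&gt;
&lt;br /&gt;
 --Should equal the ExitKeys row count if no company was assigned both exit types&lt;br /&gt;
 SELECT COUNT(*) FROM (SELECT DISTINCT coname, statecode, datefirstinv FROM ExitKeys) AS T;&lt;br /&gt;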
&lt;br /&gt;
==Create the PortCoExit And PortCoAliveDead Tables==&lt;br /&gt;
From consulting with Ed and the VC Database Rebuild wiki, I decided to make the PortCoExit table with an ipovsma, an exit, an exitvaluem, an exitdate, and an exityear column. I use the IPOvsMA column from ExitKeys to decide which data to add. It is very important that you have constructed this column.&lt;br /&gt;
 DROP TABLE PortCoExit;&lt;br /&gt;
 CREATE TABLE PortCoExit AS&lt;br /&gt;
 SELECT A.coname, A.statecode, A.datefirstinv, A.datelastinv, A.city, B.ipovsma,&lt;br /&gt;
 CASE WHEN B.ipovsma IS NOT NULL THEN 1::int ELSE 0::int END AS Exit,&lt;br /&gt;
 CASE WHEN B.ipovsma=1 THEN C.proceedsamt::numeric WHEN ipovsma=0 THEN D.transactionamt::numeric ELSE NULL::numeric END AS exitvaluem,&lt;br /&gt;
 CASE WHEN B.ipovsma=1 THEN C.issuedate WHEN ipovsma=0 THEN D.announceddate ELSE NULL::date END AS exitdate,&lt;br /&gt;
 CASE WHEN B.ipovsma=1 THEN extract(year from C.issuedate) WHEN ipovsma=0 THEN extract(year from D.announceddate) ELSE NULL::int END AS exityear&lt;br /&gt;
 FROM companybasecore AS A&lt;br /&gt;
 LEFT JOIN ExitKeys AS B ON A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv&lt;br /&gt;
 LEFT JOIN IPOCleanNoDups AS C ON A.coname=C.coname AND A.statecode=C.statecode AND A.datefirstinv=C.datefirstinv&lt;br /&gt;
 LEFT JOIN MACleanNoDups AS D ON A.coname=D.coname AND A.statecode=D.statecode AND A.datefirstinv=D.datefirstinv;&lt;br /&gt;
 --48001&lt;br /&gt;
&lt;br /&gt;
I then used this table to build one that has information as to whether a company was dead or alive. I found this information by checking whether a company had undergone an IPO or MA, indicating the company was dead. Alternatively, if the company's date of last investment was more than 5 years before the data cutoff (7/1/2018 in the query below), I marked the company as dead, with the dead date set to 5 years after the last investment.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE PortCoAliveDead;&lt;br /&gt;
 CREATE TABLE PortCoAliveDead AS&lt;br /&gt;
 SELECT *, &lt;br /&gt;
 datefirstinv as alivedate, extract(year from datefirstinv) as aliveyear,&lt;br /&gt;
 CASE WHEN exitdate IS NOT NULL then exitdate &lt;br /&gt;
 	WHEN exitdate IS NULL AND (datelastinv + INTERVAL '5 year') &amp;lt; '7/1/2018' THEN (datelastinv + INTERVAL '5 year') &lt;br /&gt;
 	ELSE NULL::date END AS deaddate,&lt;br /&gt;
 CASE WHEN exitdate IS NOT NULL then exityear &lt;br /&gt;
 	WHEN exitdate IS NULL AND (datelastinv + INTERVAL '5 year') &amp;lt; '7/1/2018' THEN extract(year from (datelastinv + INTERVAL '5 year')) &lt;br /&gt;
 	ELSE NULL::int END AS deadyear&lt;br /&gt;
 FROM PortCoExit;&lt;br /&gt;
 --48001&lt;br /&gt;
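&lt;br /&gt;
A quick tally of the result can be sketched as follows. This query is an illustration rather than part of the original build. It treats a company as dead when deaddate is populated, which follows from the CASE expressions above:&lt;br /&gt;
&lt;br /&gt;
 --COUNT(deaddate) counts only the rows where deaddate is not NULL&lt;br /&gt;
 SELECT COUNT(*) AS total, COUNT(deaddate) AS dead, COUNT(*) - COUNT(deaddate) AS alive&lt;br /&gt;
 FROM PortCoAliveDead;&lt;br /&gt;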
&lt;br /&gt;
==GeoCoding Companies, Firms, and Branch Offices==&lt;br /&gt;
A helpful page here is the [[Geocode.py]] page which explains how to use the Geocoding script. You will have to tweak the Geocode script when geocoding as each of these tables has a different primary key. It is vital that you include the primary keys in the file you input and output from the Geocoding script. Without these, you will not be able to join the latitudes and longitudes back to the firm, branch office, or company base tables.&lt;br /&gt;
&lt;br /&gt;
Geocoding costs money since we are using the Google Maps API. The process doesn't cost much, but in order to save money I tried to salvage as much of the preexisting geocode information I could find.&lt;br /&gt;
===Companies===&lt;br /&gt;
I found the table of old companies with latitudes and longitudes in vcdb2 and loaded these into vcdb3.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE oldgeocords;&lt;br /&gt;
 CREATE TABLE oldgeocords (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   ivestedk real,&lt;br /&gt;
   city varchar(255),&lt;br /&gt;
   addr1 varchar(255),&lt;br /&gt;
   addr2 varchar(100),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY oldgeocords FROM 'companybasegeomaster.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --44740&lt;br /&gt;
&lt;br /&gt;
The API occasionally returns erroneous latitude and longitude readings. To keep only the good ones, I defined a bounding box around the mainland US (longitude between -125 and -66, latitude between 24 and 50) and created an exclude flag for companies that fall outside this box or have no coordinates. I then created flags to include companies in Puerto Rico, Hawaii, and Alaska, which lie outside the box. Companies in these places often had the wrong reading of latitude 44.9331, longitude 7.54012, so the flags are only set when the coordinates differ from these values.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geoallcoords;&lt;br /&gt;
 CREATE TABLE geoallcoords AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM oldgeocords;&lt;br /&gt;
 --44740&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geoallcoords1;&lt;br /&gt;
 CREATE TABLE geoallcoords1 AS SELECT&lt;br /&gt;
 *, CASE WHEN statecode='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN statecode='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN statecode='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM geoallcoords;&lt;br /&gt;
 --44740&lt;br /&gt;
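&lt;br /&gt;
A quick tally of the flags, sketched below, can help confirm that the filters behave as expected before any rows are dropped. This query is an illustrative addition rather than part of the original build.&lt;br /&gt;
&lt;br /&gt;
 --Sketch: count rows by flag combination&lt;br /&gt;
 SELECT excludeflag, prflag, hiflag, akflag, count(*) AS numrows&lt;br /&gt;
 FROM geoallcoords1&lt;br /&gt;
 GROUP BY excludeflag, prflag, hiflag, akflag&lt;br /&gt;
 ORDER BY excludeflag, prflag, hiflag, akflag;&lt;br /&gt;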
&lt;br /&gt;
I then kept only companies in the mainland US, Hawaii, Alaska, or Puerto Rico.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodgeoold;&lt;br /&gt;
 CREATE TABLE goodgeoold AS SELECT&lt;br /&gt;
 A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM companybasecore AS A LEFT JOIN geoallcoords1 AS B ON&lt;br /&gt;
 A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --38498&lt;br /&gt;
&lt;br /&gt;
I then found the remaining companies that needed to be geocoded. Only companies that have addresses listed can be geocoded accurately. If we attempt to geocode based on city alone, the location returned will simply be the center of the city. Thus, I chose the companies that we did not already have coordinates for and that had an address listed.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE remaininggeo;&lt;br /&gt;
 CREATE TABLE remaininggeo AS SELECT A.coname, A.statecode, A.datefirstinv, A.addr1, A.addr2, A.city, A.zip FROM companybasecore AS A LEFT JOIN goodgeoold AS B ON A.coname=B.coname &lt;br /&gt;
 AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv&lt;br /&gt;
 WHERE B.coname IS NULL AND A.addr1 IS NOT NULL;&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
 \COPY remaininggeo TO 'RemainingGeo.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
I copied this table into Excel to concatenate the address, city, state, and zip code columns into one column. This can and should be done in SQL, but I was not aware of that at the time. I then ran remaininggeo through the Geocode script, with the columns coname, statecode, datefirstinv, and address in the input file.&lt;br /&gt;
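&lt;br /&gt;
A minimal sketch of doing the concatenation in SQL instead, assuming PostgreSQL 9.1 or later (for concat_ws), that the address columns are text, and that a single comma-separated address string is acceptable to the Geocode script. concat_ws skips NULL values, and NULLIF turns empty strings into NULLs so that they are skipped too.&lt;br /&gt;
&lt;br /&gt;
 --Sketch: build one address column per company, keeping the key columns&lt;br /&gt;
 SELECT coname, statecode, datefirstinv,&lt;br /&gt;
 concat_ws(', ', NULLIF(addr1,''), NULLIF(addr2,''), NULLIF(city,''), statecode, zip) AS address&lt;br /&gt;
 FROM remaininggeo;&lt;br /&gt;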
&lt;br /&gt;
 DROP TABLE remaining;&lt;br /&gt;
 CREATE TABLE remaining (&lt;br /&gt;
   coname varchar(255),&lt;br /&gt;
   statecode varchar(10),&lt;br /&gt;
   datefirstinv date,&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY remaining FROM 'RemainingLatLong.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
I then ran the same geographical checks on the newly geocoded companies and found all of the good geocodes. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geoallcoords2;&lt;br /&gt;
 CREATE TABLE geoallcoords2 AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM remaining;&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geoallcoords3;&lt;br /&gt;
 CREATE TABLE geoallcoords3 AS&lt;br /&gt;
 SELECT *, CASE WHEN statecode='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN statecode='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN statecode='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM geoallcoords2;&lt;br /&gt;
 --5955&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodgeonew;&lt;br /&gt;
 CREATE TABLE goodgeonew AS&lt;br /&gt;
 SELECT A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM companybasecore AS A LEFT JOIN geoallcoords3 AS B ON&lt;br /&gt;
 A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --5913&lt;br /&gt;
&lt;br /&gt;
I then combined the old and new geocodes and matched them back to the company base table to get a geo table for companies.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE geocodesportco;&lt;br /&gt;
 CREATE TABLE geocodesportco AS SELECT&lt;br /&gt;
 A.* FROM goodgeonew AS A&lt;br /&gt;
 UNION&lt;br /&gt;
 SELECT B.* FROM goodgeoold AS B;&lt;br /&gt;
 --44411&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE portcogeo;&lt;br /&gt;
 CREATE TABLE portcogeo AS SELECT&lt;br /&gt;
 A.*, B.latitude, B.longitude FROM companybasecore AS A LEFT JOIN Geocodesportco AS B ON A.coname=B.coname AND A.datefirstinv=B.datefirstinv AND A.statecode=B.statecode;&lt;br /&gt;
 --48001&lt;br /&gt;
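&lt;br /&gt;
As a hedged sanity check (an illustrative addition, not part of the original build), the coverage of the final table can be summarized in one query. count(latitude) counts only rows where latitude is not NULL, so the second column is the number of companies with a geocode.&lt;br /&gt;
&lt;br /&gt;
 --Sketch: how many companies ended up with coordinates&lt;br /&gt;
 SELECT count(*) AS numcompanies, count(latitude) AS numgeocoded&lt;br /&gt;
 FROM portcogeo;&lt;br /&gt;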
&lt;br /&gt;
===Firms===&lt;br /&gt;
This process is largely the same as for companies. I found old firms that had already been geocoded and checked for accuracy.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE oldfirmcoords;&lt;br /&gt;
 CREATE TABLE oldfirmcoords (&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
 &lt;br /&gt;
 \COPY oldfirmcoords FROM 'FirmCoords.txt' DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --5556&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmoldfilter;&lt;br /&gt;
 CREATE TABLE firmoldfilter AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM oldfirmcoords;&lt;br /&gt;
 --5556&lt;br /&gt;
&lt;br /&gt;
Since oldfirmcoords does not have state codes, we need to bring in state codes in order to add back firms based in Puerto Rico, Hawaii, and Alaska. I did this by matching the firmoldfilter table back to the firm base table. I used the COALESCE function to set excludeflag to 1 for firms with no match, because we wanted to exclude firms that had not been geocoded due to faulty addresses.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsmatch1;&lt;br /&gt;
 CREATE TABLE firmcoordsmatch1 AS SELECT &lt;br /&gt;
 A.firmname, A.state, B.latitude, B.longitude, COALESCE(B.excludeflag, 1) AS excludeflag FROM firmbasecore AS A LEFT JOIN firmoldfilter AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
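&lt;br /&gt;
To illustrate the role of COALESCE here, the following self-contained sketch uses made-up firm names. A firm with no match in the geocode table gets a NULL excludeflag from the LEFT JOIN, which COALESCE converts to 1 so that the firm is excluded.&lt;br /&gt;
&lt;br /&gt;
 --Sketch with hypothetical data: 'Firm A' keeps its flag of 0, 'Firm B' has no geocode so its flag becomes 1&lt;br /&gt;
 SELECT A.firmname, COALESCE(B.excludeflag, 1) AS excludeflag&lt;br /&gt;
 FROM (VALUES ('Firm A'), ('Firm B')) AS A(firmname)&lt;br /&gt;
 LEFT JOIN (VALUES ('Firm A', 0)) AS B(firmname, excludeflag) ON A.firmname=B.firmname;&lt;br /&gt;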
&lt;br /&gt;
Then the process of tagging the PR, HI, and AK firms and keeping only correctly tagged firms is the same as for companies.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsexternal;&lt;br /&gt;
 CREATE TABLE firmcoordsexternal AS&lt;br /&gt;
 SELECT *, CASE WHEN state='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN state='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN state='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM firmcoordsmatch1;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodfirmgeoold;&lt;br /&gt;
 CREATE TABLE goodfirmgeoold AS SELECT&lt;br /&gt;
 A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM firmcoreonedupremoved AS A LEFT JOIN firmcoordsexternal AS B ON A.firmname=B.firmname&lt;br /&gt;
 WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --5346&lt;br /&gt;
&lt;br /&gt;
Find the remaining firms and run the Geocode script on them.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE remainingfirm;&lt;br /&gt;
 CREATE TABLE remainingfirm AS SELECT A.firmname, A.addr1, A.addr2, A.city, A.state, A.zip FROM firmcoreonedupremoved AS A LEFT JOIN goodfirmgeoold AS B ON A.firmname=B.firmname&lt;br /&gt;
 WHERE B.firmname IS NULL AND A.addr1 IS NOT NULL AND A.msacode!='9999';&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
 \COPY remainingfirm TO 'FirmGeoRemaining.txt' DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmremainingcoords;&lt;br /&gt;
 CREATE TABLE firmremainingcoords(&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY firmremainingcoords FROM 'FirmRemainingCoords.txt' DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
Follow the same filtering process as above to get the good geocodes. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmnewfilter;&lt;br /&gt;
 CREATE TABLE firmnewfilter AS&lt;br /&gt;
 SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM firmremainingcoords;&lt;br /&gt;
 --706&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsmatch2;&lt;br /&gt;
 CREATE TABLE firmcoordsmatch2 AS SELECT &lt;br /&gt;
 A.firmname, A.state, B.latitude, B.longitude, COALESCE(B.excludeflag, 1) AS excludeflag FROM firmcoreonedupremoved AS A LEFT JOIN firmnewfilter AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmcoordsexternalremaining;&lt;br /&gt;
 CREATE TABLE firmcoordsexternalremaining AS&lt;br /&gt;
 SELECT *, CASE WHEN state='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN state='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN state='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM firmcoordsmatch2;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE goodfirmgeonew;&lt;br /&gt;
 CREATE TABLE goodfirmgeonew AS SELECT A.*, B.latitude, B.longitude, B.prflag, B.excludeflag, B.hiflag, B.akflag FROM firmcoreonedupremoved AS A LEFT JOIN firmcoordsexternalremaining AS B &lt;br /&gt;
 ON A.firmname=B.firmname&lt;br /&gt;
 WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --703&lt;br /&gt;
&lt;br /&gt;
Combine the old and new geocoded firms and match them to firm base to get a firm geo table.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmgeocoords;&lt;br /&gt;
 CREATE TABLE firmgeocoords AS&lt;br /&gt;
 SELECT * FROM goodfirmgeonew&lt;br /&gt;
 UNION&lt;br /&gt;
 SELECT * FROM goodfirmgeoold;&lt;br /&gt;
 --6049&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmgeocore;&lt;br /&gt;
 CREATE TABLE firmgeocore AS&lt;br /&gt;
 SELECT A.*, B.latitude, B.longitude FROM firmbasecore AS A LEFT JOIN firmgeocoords AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
===Branch Offices===&lt;br /&gt;
I did not use old branch office data because I could not find it anywhere in the old data set. I have since found old data in the table firmbasecoords in vcdb2. &lt;br /&gt;
&lt;br /&gt;
First copy all of the needed data out of the database to do geocoding.&lt;br /&gt;
&lt;br /&gt;
 \COPY (SELECT A.firmname, A.boaddr1, A.boaddr2, A.bocity, A.bostate, A.bozip FROM bonound AS A WHERE A.boaddr1 IS NOT NULL) TO 'BranchOffices.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --2046&lt;br /&gt;
&lt;br /&gt;
After running the Geocode script on this file, load the results into the database and follow the same filtering process as above.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeo;&lt;br /&gt;
 CREATE TABLE bogeo (&lt;br /&gt;
   firmname varchar(255),&lt;br /&gt;
   latitude numeric,&lt;br /&gt;
   longitude numeric&lt;br /&gt;
   );&lt;br /&gt;
&lt;br /&gt;
 \COPY bogeo FROM 'BranchOfficesGeo.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --2046&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeo1;&lt;br /&gt;
 CREATE TABLE bogeo1 AS SELECT *, CASE&lt;br /&gt;
 WHEN longitude &amp;lt; -125 OR longitude &amp;gt; -66 OR latitude &amp;lt; 24 OR latitude &amp;gt; 50 OR latitude IS NULL OR longitude IS NULL THEN 1::int ELSE &lt;br /&gt;
 0::int END AS excludeflag FROM bogeo;&lt;br /&gt;
 --2046&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bomatchgeo;&lt;br /&gt;
 CREATE TABLE bomatchgeo AS&lt;br /&gt;
 SELECT A.*, B.latitude, B.longitude, COALESCE(B.excludeflag, 1) AS excludeflag FROM branchofficecore AS A LEFT JOIN bogeo1 AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeo2;&lt;br /&gt;
 CREATE TABLE bogeo2 AS&lt;br /&gt;
 SELECT *, CASE WHEN bostate='PR' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as prflag,&lt;br /&gt;
 CASE WHEN bostate='HI' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int ELSE 0::int END as hiflag,&lt;br /&gt;
 CASE WHEN bostate='AK' AND latitude!=44.9331 AND longitude!=7.54012 THEN 1::int  ELSE 0::int END as akflag&lt;br /&gt;
 FROM bomatchgeo;&lt;br /&gt;
 --10032&lt;br /&gt;
&lt;br /&gt;
Match the correctly geocoded branch offices back to firm base to get the final table.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE bogeocore1;&lt;br /&gt;
 CREATE TABLE bogeocore1 AS&lt;br /&gt;
 SELECT * FROM bogeo2 WHERE excludeflag=0 or prflag=1 or hiflag=1 or akflag=1;&lt;br /&gt;
 --1161&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmbogeo;&lt;br /&gt;
 CREATE TABLE firmbogeo AS&lt;br /&gt;
 SELECT A.*, B.latitude AS BOLatitude, B.longitude AS BOLongitude FROM firmgeocore AS A LEFT JOIN bogeocore1 AS B ON A.firmname=B.firmname;&lt;br /&gt;
 --15437&lt;br /&gt;
&lt;br /&gt;
==Creating People Tables==&lt;br /&gt;
We pulled data on executives in both portcos and funds. I describe the process below. If any of the explanations are unclear, most of the tables are also described in the section called Marcos's Code.&lt;br /&gt;
===Company People===&lt;br /&gt;
 DROP TABLE titlelookup;&lt;br /&gt;
 CREATE TABLE titlelookup(&lt;br /&gt;
 	fulltitle varchar(150),&lt;br /&gt;
 	charman int, &lt;br /&gt;
 	ceo int,&lt;br /&gt;
 	cfo int,&lt;br /&gt;
 	coo int,&lt;br /&gt;
 	cio int,&lt;br /&gt;
 	cto int,&lt;br /&gt;
 	otherclvl int,&lt;br /&gt;
 	boardmember int,&lt;br /&gt;
 	president int,&lt;br /&gt;
 	vp int,&lt;br /&gt;
 	founder int,&lt;br /&gt;
 	director int&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY titlelookup FROM 'Important Titles in Women2017 dataset.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --628&lt;br /&gt;
&lt;br /&gt;
This table maps each full job title to indicator columns for the traditional executive roles it falls under (chairman, CEO, CFO, and so on).&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE copeople;&lt;br /&gt;
 CREATE TABLE copeople(&lt;br /&gt;
 	datefirstinv   date,&lt;br /&gt;
 	cname varchar(150),&lt;br /&gt;
 	statecode  varchar(2),&lt;br /&gt;
 	prefix varchar(5),&lt;br /&gt;
 	firstname varchar(50),&lt;br /&gt;
 	lastname varchar(50),&lt;br /&gt;
 	jobtitle varchar(150),&lt;br /&gt;
 	nonmanaging  varchar(1),&lt;br /&gt;
 	prevpos  varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY copeople FROM 'Executives-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --194359&lt;br /&gt;
&lt;br /&gt;
This table holds the executives of portcos, as loaded from SDC. The next table identifies which traditional executive roles each listed job title corresponds to, using the title lookup, and flags whether the prefix identifies an executive as male or female. I made the mistake of writing cname instead of coname when loading the data. To save yourself work, write coname.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE copeoplebase;&lt;br /&gt;
 CREATE TABLE copeoplebase AS&lt;br /&gt;
 SELECT copeople.*,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 1::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 0::int&lt;br /&gt;
 	ELSE Null::int END AS titlefemale,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 0::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 1::int&lt;br /&gt;
 	ELSE Null::int END AS titlemale,&lt;br /&gt;
 CASE WHEN prefix='Dr' THEN 1::int&lt;br /&gt;
 	ELSE 0::int END AS doctor,&lt;br /&gt;
 CASE WHEN prefix IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hastitle,&lt;br /&gt;
 CASE WHEN prefix IS NULL AND firstname IS NULL AND lastname IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hasperson,&lt;br /&gt;
 CASE WHEN fulltitle IS NOT NULL THEN 1::int ELSE 0::int END AS hastitlelookup,&lt;br /&gt;
 CASE WHEN charman IS NOT NULL THEN charman ELSE 0::int END AS chairman,&lt;br /&gt;
 CASE WHEN ceo IS NOT NULL THEN ceo ELSE 0::int END AS ceo,&lt;br /&gt;
 CASE WHEN cfo IS NOT NULL THEN cfo ELSE 0::int END AS cfo,&lt;br /&gt;
 CASE WHEN coo IS NOT NULL THEN coo ELSE 0::int END AS coo,&lt;br /&gt;
 CASE WHEN cio IS NOT NULL THEN cio ELSE 0::int END AS cio,&lt;br /&gt;
 CASE WHEN cto IS NOT NULL THEN cto ELSE 0::int END AS cto,&lt;br /&gt;
 CASE WHEN otherclvl IS NOT NULL THEN otherclvl ELSE 0::int END AS otherclvl,&lt;br /&gt;
 CASE WHEN boardmember IS NOT NULL THEN boardmember ELSE 0::int END AS boardmember,&lt;br /&gt;
 CASE WHEN president IS NOT NULL THEN president ELSE 0::int END AS president,&lt;br /&gt;
 CASE WHEN vp IS NOT NULL THEN vp ELSE 0::int END AS vp,&lt;br /&gt;
 CASE WHEN founder IS NOT NULL THEN founder ELSE 0::int END AS founder,&lt;br /&gt;
 CASE WHEN director IS NOT NULL THEN director ELSE 0::int END AS director&lt;br /&gt;
 FROM copeople&lt;br /&gt;
 LEFT JOIN titlelookup ON copeople.jobtitle=titlelookup.fulltitle;&lt;br /&gt;
 --194359&lt;br /&gt;
&lt;br /&gt;
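Next we try to infer gender from first names. A first name counts as exclusively female if it appears with the prefix Ms and never with Mr, and as exclusively male in the opposite case. Names that appear with both prefixes are dropped, and rows with the prefix Dr are ignored.&lt;br /&gt;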
&lt;br /&gt;
 DROP TABLE namegender;&lt;br /&gt;
 CREATE TABLE namegender AS&lt;br /&gt;
 SELECT firstname, &lt;br /&gt;
 CASE WHEN countfemale &amp;gt; 0 AND countmale=0 THEN 1::int ELSE 0::int END AS exclusivelyfemale,&lt;br /&gt;
 CASE WHEN countmale &amp;gt; 0 AND countfemale=0 THEN 1::int ELSE 0::int END AS exclusivelymale&lt;br /&gt;
 FROM&lt;br /&gt;
 	(SELECT firstname, COALESCE(sum(titlefemale),0) as countfemale,  COALESCE(sum(titlemale),0) as countmale &lt;br /&gt;
 	FROM copeoplebase WHERE doctor=0&lt;br /&gt;
 	GROUP BY firstname) As T&lt;br /&gt;
 WHERE NOT (countfemale &amp;gt; 0 AND countmale&amp;gt;0);&lt;br /&gt;
 --12736&lt;br /&gt;
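&lt;br /&gt;
The same logic can be tried on a few made-up rows. This is a self-contained sketch with hypothetical names: 'Ann' only appears as female, 'Bob' only as male, and 'Pat' appears as both and is therefore dropped.&lt;br /&gt;
&lt;br /&gt;
 --Sketch with hypothetical data: returns rows for Ann (exclusivelyfemale=1) and Bob (exclusivelymale=1) only&lt;br /&gt;
 SELECT firstname,&lt;br /&gt;
 CASE WHEN countfemale &amp;gt; 0 AND countmale=0 THEN 1 ELSE 0 END AS exclusivelyfemale,&lt;br /&gt;
 CASE WHEN countmale &amp;gt; 0 AND countfemale=0 THEN 1 ELSE 0 END AS exclusivelymale&lt;br /&gt;
 FROM&lt;br /&gt;
 	(SELECT firstname, sum(titlefemale) AS countfemale, sum(titlemale) AS countmale&lt;br /&gt;
 	FROM (VALUES ('Ann',1,0), ('Ann',1,0), ('Bob',0,1), ('Pat',1,0), ('Pat',0,1)) AS P(firstname, titlefemale, titlemale)&lt;br /&gt;
 	GROUP BY firstname) AS T&lt;br /&gt;
 WHERE NOT (countfemale &amp;gt; 0 AND countmale &amp;gt; 0);&lt;br /&gt;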
&lt;br /&gt;
The next table expands CoPeopleBase to include information on executive gender and executive position.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE CoPeopleFull;&lt;br /&gt;
 CREATE TABLE CoPeopleFull AS&lt;br /&gt;
 SELECT copeoplebase.*,&lt;br /&gt;
 CASE WHEN titlefemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelyfemale=1 THEN 1::int ELSE 0::int END AS female,&lt;br /&gt;
 CASE WHEN titlemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelymale=1 THEN 1::int ELSE 0::int END AS male,	&lt;br /&gt;
 CASE WHEN (titlefemale=1 OR titlemale=1 OR exclusivelymale=1 OR exclusivelyfemale=1) THEN 0::int ELSE 1::int END AS unknowngender,&lt;br /&gt;
 CASE WHEN (ceo=1 OR president=1) THEN 1::int ELSE 0::int END AS ceopres,&lt;br /&gt;
 CASE WHEN (chairman=1 OR ceo=1 OR cfo=1 OR coo=1 OR cio=1 OR cto=1 OR otherclvl=1 OR president=1) THEN 1::int ELSE 0::int END AS CLevel,&lt;br /&gt;
 CASE WHEN (chairman=1 OR ceo=1 OR cfo=1 OR coo=1 OR cio=1 OR cto=1 OR otherclvl=1 OR president=1 OR director=1 OR boardmember=1) THEN 1::int ELSE 0::int END AS board,&lt;br /&gt;
 CASE WHEN (chairman=1 OR ceo=1 OR cfo=1 OR coo=1 OR cio=1 OR cto=1 OR otherclvl=1 OR president=1 OR director=1 OR boardmember=1 OR vp=1 OR founder=1) THEN 1::int ELSE &lt;br /&gt;
 0::int END AS vpandabove&lt;br /&gt;
 FROM copeoplebase&lt;br /&gt;
 LEFT JOIN namegender ON namegender.firstname=copeoplebase.firstname&lt;br /&gt;
 WHERE hasperson=1;&lt;br /&gt;
 --177547&lt;br /&gt;
&lt;br /&gt;
The next table only keeps executive listings that have a valid portco primary key associated with them. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE CoPeopleKey;&lt;br /&gt;
 CREATE TABLE CoPeopleKey AS&lt;br /&gt;
 SELECT A.*&lt;br /&gt;
 FROM CoPeopleFull AS A&lt;br /&gt;
 JOIN (SELECT firstname, lastname, cname, datefirstinv, statecode, count(*) FROM CoPeopleFull &lt;br /&gt;
 WHERE firstname IS NOT NULL AND lastname IS NOT NULL AND cname IS NOT NULL AND datefirstinv IS NOT NULL AND statecode IS NOT NULL&lt;br /&gt;
 GROUP BY firstname, lastname, cname, datefirstinv, statecode HAVING COUNT(*)=1) AS B&lt;br /&gt;
 ON A.firstname=B.firstname AND A.lastname=B.lastname AND A.datefirstinv=B.datefirstinv AND A.cname=B.cname AND A.statecode=B.statecode;&lt;br /&gt;
 --176251&lt;br /&gt;
&lt;br /&gt;
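The next table identifies whether a person previously held positions at other portcos. A person is matched on first and last name, and a position counts as previous if that portco's date of first investment is earlier.&lt;br /&gt;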
&lt;br /&gt;
 DROP TABLE CoPeopleSerial;&lt;br /&gt;
 CREATE TABLE CoPeopleSerial AS&lt;br /&gt;
 SELECT firstname, lastname, cname, datefirstinv, statecode, &lt;br /&gt;
 COALESCE(sum(hasperson),0) as prev,&lt;br /&gt;
 COALESCE(sum(ceo),0) as prevceo,&lt;br /&gt;
 COALESCE(sum(ceopres),0) as prevceopres,&lt;br /&gt;
 COALESCE(sum(founder),0) as prevfounder,&lt;br /&gt;
 COALESCE(sum(clevel),0) as prevclevel,&lt;br /&gt;
 COALESCE(sum(board),0) as prevboard,&lt;br /&gt;
 COALESCE(sum(vpandabove),0) as prevvpandabove,&lt;br /&gt;
 CASE WHEN sum(hasperson) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serial,&lt;br /&gt;
 CASE WHEN sum(ceo) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialceo,&lt;br /&gt;
 CASE WHEN sum(ceopres) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialceopres,&lt;br /&gt;
 CASE WHEN sum(founder) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialfounder,&lt;br /&gt;
 CASE WHEN sum(clevel) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialclevel,&lt;br /&gt;
 CASE WHEN sum(board) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialboard,&lt;br /&gt;
 CASE WHEN sum(vpandabove) &amp;gt;=1 THEN 1::int ELSE 0::int END AS serialvpandabove&lt;br /&gt;
 FROM (&lt;br /&gt;
 	SELECT A.prefix, A.firstname, A.lastname, A.cname, A.datefirstinv, A.statecode, &lt;br /&gt;
 	B.cname as prevcname, B.datefirstinv as prevdatefirstinv, B.statecode as prevstatecode, B.ceo, B.ceopres, B.founder, B.clevel, B.board, B.vpandabove, B.hasperson&lt;br /&gt;
 	FROM CoPeopleKey AS A&lt;br /&gt;
 	LEFT JOIN CoPeopleKey AS B ON A.firstname=B.firstname AND A.lastname=B.lastname AND A.datefirstinv &amp;gt; B.datefirstinv&lt;br /&gt;
 ) AS T&lt;br /&gt;
 GROUP BY firstname, lastname, cname, datefirstinv, statecode;&lt;br /&gt;
 --176251&lt;br /&gt;
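&lt;br /&gt;
The self-join can be illustrated on made-up data. In this sketch the person, company names, and dates are all hypothetical. Note that matching on first and last name alone may conflate different people who share a name.&lt;br /&gt;
&lt;br /&gt;
 --Sketch with hypothetical data: returns prev=0 for OldCo and prev=1 for NewCo&lt;br /&gt;
 WITH P(firstname, lastname, cname, datefirstinv) AS (VALUES&lt;br /&gt;
 	('Ann','Lee','OldCo','2000-01-01'::date),&lt;br /&gt;
 	('Ann','Lee','NewCo','2010-01-01'::date))&lt;br /&gt;
 SELECT A.cname, count(B.cname) AS prev&lt;br /&gt;
 FROM P AS A&lt;br /&gt;
 LEFT JOIN P AS B ON A.firstname=B.firstname AND A.lastname=B.lastname AND A.datefirstinv &amp;gt; B.datefirstinv&lt;br /&gt;
 GROUP BY A.cname;&lt;br /&gt;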
&lt;br /&gt;
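The last table aggregates the executive information for each company. There are too many columns to explain individually, but the names follow a pattern: the prefix female, male, or ug (unknown gender) is combined with a role (ceos, founders, clevels, boards, abovevps), with doctor for the Dr prefix, and with prev (counts of previous positions) or serial (indicators of any previous position).&lt;br /&gt;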
&lt;br /&gt;
 DROP TABLE copeopleagg;&lt;br /&gt;
 CREATE TABLE copeopleagg AS&lt;br /&gt;
 SELECT A.cname, A.datefirstinv, A.statecode, &lt;br /&gt;
 sum(hasperson) as numperson,&lt;br /&gt;
 sum(hastitle) as numtitled,&lt;br /&gt;
 CASE WHEN sum(ceopres) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasceopres,&lt;br /&gt;
 CASE WHEN sum(founder) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasfounder,&lt;br /&gt;
 CASE WHEN sum(clevel) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasclevel,&lt;br /&gt;
 CASE WHEN sum(board) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasboard,&lt;br /&gt;
 CASE WHEN sum(vpandabove) &amp;gt;=1 THEN 1::int ELSE 0::int END AS hasvpandabove,&lt;br /&gt;
 sum(female) as females,&lt;br /&gt;
 sum(male) as males,&lt;br /&gt;
 sum(unknowngender) as ugs,&lt;br /&gt;
 sum(doctor*female) as femaledoctors,&lt;br /&gt;
 sum(doctor*male) as maledoctors,&lt;br /&gt;
 sum(doctor*unknowngender) as ugdoctors,&lt;br /&gt;
 sum(ceopres*female) as femaleceos,&lt;br /&gt;
 sum(ceopres*male) as maleceos,&lt;br /&gt;
 sum(ceopres*unknowngender) as ugceos,&lt;br /&gt;
 sum(ceopres*female*doctor) as femaledoctorceos,&lt;br /&gt;
 sum(ceopres*male*doctor) as maledoctorceos,&lt;br /&gt;
 sum(ceopres*unknowngender*doctor) as ugdoctorceos,&lt;br /&gt;
 sum(founder*female) as femalefounders,&lt;br /&gt;
 sum(founder*male) as malefounders,&lt;br /&gt;
 sum(founder*unknowngender) as ugfounders,&lt;br /&gt;
 sum(founder*female*doctor) as femaledoctorfounders,&lt;br /&gt;
 sum(founder*male*doctor) as maledoctorfounders,&lt;br /&gt;
 sum(founder*unknowngender*doctor) as ugdoctorfounders,&lt;br /&gt;
 sum(clevel*female) as femaleclevels,&lt;br /&gt;
 sum(clevel*male) as maleclevels,&lt;br /&gt;
 sum(clevel*unknowngender) as ugclevels,&lt;br /&gt;
 sum(clevel*female*doctor) as femaledoctorclevels,&lt;br /&gt;
 sum(clevel*male*doctor) as maledoctorclevels,&lt;br /&gt;
 sum(clevel*unknowngender*doctor) as ugdoctorclevels,&lt;br /&gt;
 sum(board*female) as femaleboards,&lt;br /&gt;
 sum(board*male) as maleboards,&lt;br /&gt;
 sum(board*unknowngender) as ugboards,&lt;br /&gt;
 sum(board*female*doctor) as femaledoctorboards,&lt;br /&gt;
 sum(board*male*doctor) as maledoctorboards,&lt;br /&gt;
 sum(board*unknowngender*doctor) as ugdoctorboards,&lt;br /&gt;
 sum(vpandabove*female) as femaleabovevps,&lt;br /&gt;
 sum(vpandabove*male) as maleabovevps,&lt;br /&gt;
 sum(vpandabove*unknowngender) as ugabovevps,&lt;br /&gt;
 sum(vpandabove*female*doctor) as femaledoctorabovevps,&lt;br /&gt;
 sum(vpandabove*male*doctor) as maledoctorabovevps,&lt;br /&gt;
 sum(vpandabove*unknowngender*doctor) as ugdoctorabovevps,&lt;br /&gt;
 sum(prev*female) as femaleprevs,&lt;br /&gt;
 sum(prev*male) as maleprevs,&lt;br /&gt;
 sum(prev*unknowngender) as ugprevs,&lt;br /&gt;
 sum(prevceopres*female) as femaleprevceopres,&lt;br /&gt;
 sum(prevceopres*male) as maleprevceopres,&lt;br /&gt;
 sum(prevceopres*unknowngender) as ugprevceopres,&lt;br /&gt;
 sum(prevfounder*female) as femaleprevfounder,&lt;br /&gt;
 sum(prevfounder*male) as maleprevfounder,&lt;br /&gt;
 sum(prevfounder*unknowngender) as ugprevfounder,&lt;br /&gt;
 sum(prevclevel*female) as femaleprevclevel,&lt;br /&gt;
 sum(prevclevel*male) as maleprevclevel,&lt;br /&gt;
 sum(prevclevel*unknowngender) as ugprevclevel,&lt;br /&gt;
 sum(prevboard*female) as femaleprevboard,&lt;br /&gt;
 sum(prevboard*male) as maleprevboard,&lt;br /&gt;
 sum(prevboard*unknowngender) as ugprevboard,&lt;br /&gt;
 sum(prevvpandabove*female) as femaleprevvpandabove,&lt;br /&gt;
 sum(prevvpandabove*male) as maleprevvpandabove,&lt;br /&gt;
 sum(prevvpandabove*unknowngender) as ugprevvpandabove,&lt;br /&gt;
 sum(serial*female) as femaleserials,&lt;br /&gt;
 sum(serial*male) as maleserials,&lt;br /&gt;
 sum(serial*unknowngender) as ugserials,&lt;br /&gt;
 sum(serialceopres*female) as femaleserialceopres,&lt;br /&gt;
 sum(serialceopres*male) as maleserialceopres,&lt;br /&gt;
 sum(serialceopres*unknowngender) as ugserialceopres,&lt;br /&gt;
 sum(serialfounder*female) as femaleserialfounder,&lt;br /&gt;
 sum(serialfounder*male) as maleserialfounder,&lt;br /&gt;
 sum(serialfounder*unknowngender) as ugserialfounder,&lt;br /&gt;
 sum(serialclevel*female) as femaleserialclevel,&lt;br /&gt;
 sum(serialclevel*male) as maleserialclevel,&lt;br /&gt;
 sum(serialclevel*unknowngender) as ugserialclevel,&lt;br /&gt;
 sum(serialboard*female) as femaleserialboard,&lt;br /&gt;
 sum(serialboard*male) as maleserialboard,&lt;br /&gt;
 sum(serialboard*unknowngender) as ugserialboard,&lt;br /&gt;
 sum(serialvpandabove*female) as femaleserialvpandabove,&lt;br /&gt;
 sum(serialvpandabove*male) as maleserialvpandabove,&lt;br /&gt;
 sum(serialvpandabove*unknowngender) as ugserialvpandabove,&lt;br /&gt;
 sum(ceopres*serialceopres*female) as femaleceopresserialceopres,&lt;br /&gt;
 sum(ceopres*serialceopres*male) as maleceopresserialceopres,&lt;br /&gt;
 sum(ceopres*serialceopres*unknowngender) as ugceopresserialceopres,&lt;br /&gt;
 sum(founder*serialfounder*female) as femalefounderserialfounder,&lt;br /&gt;
 sum(founder*serialfounder*male) as malefounderserialfounder,&lt;br /&gt;
 sum(founder*serialfounder*unknowngender) as ugfounderserialfounder &lt;br /&gt;
 FROM CoPeoplekey AS A&lt;br /&gt;
 JOIN CoPeopleSerial AS B &lt;br /&gt;
 ON A.firstname=B.firstname AND A.lastname=B.lastname AND A.datefirstinv=B.datefirstinv AND A.cname=B.cname AND A.statecode=B.statecode&lt;br /&gt;
 GROUP BY A.cname, A.datefirstinv, A.statecode;&lt;br /&gt;
 --30413&lt;br /&gt;
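&lt;br /&gt;
The aggregates above rely on multiplying 0/1 indicator columns: a product such as ceopres*female is 1 only when both flags are 1, so its sum counts the rows that satisfy both conditions. A minimal sketch on made-up rows:&lt;br /&gt;
&lt;br /&gt;
 --Sketch with hypothetical data: returns females=2, femaleceos=1, maleceos=1&lt;br /&gt;
 SELECT sum(female) AS females, sum(ceopres*female) AS femaleceos, sum(ceopres*male) AS maleceos&lt;br /&gt;
 FROM (VALUES (1,0,1), (1,0,0), (0,1,1)) AS P(female, male, ceopres);&lt;br /&gt;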
&lt;br /&gt;
Since this table is so big, it is a good idea to have a smaller, more manageable table to work with. &lt;br /&gt;
&lt;br /&gt;
 DROP TABLE copeopleaggsimple;&lt;br /&gt;
 CREATE TABLE copeopleaggsimple AS&lt;br /&gt;
 SELECT cname, datefirstinv, statecode, numperson, females, males, ugs, ugdoctors, maleserials+femaleserials+ugserials AS serials&lt;br /&gt;
 FROM copeopleagg;&lt;br /&gt;
 --30413&lt;br /&gt;
&lt;br /&gt;
===Fund People===&lt;br /&gt;
Luckily, this process is much easier than the company people process. First we simply load the data into the database.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundpeople;&lt;br /&gt;
 CREATE TABLE fundpeople(&lt;br /&gt;
 	fundname  varchar(255),&lt;br /&gt;
 	fundyear  int,&lt;br /&gt;
 	prefix varchar(5),&lt;br /&gt;
 	firstname varchar(50),&lt;br /&gt;
 	lastname varchar(50),&lt;br /&gt;
 	jobtitle varchar(150),&lt;br /&gt;
 	 prevpos  varchar(255)&lt;br /&gt;
 );&lt;br /&gt;
&lt;br /&gt;
 \COPY fundpeople FROM 'Executives-Funds-NoFoot-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV&lt;br /&gt;
 --328994&lt;br /&gt;
&lt;br /&gt;
The next table flags, from the prefix, whether each fund executive is female or male (Ms or Mr) and whether they use the prefix Dr.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE fundpeoplebase;&lt;br /&gt;
 CREATE TABLE fundpeoplebase AS&lt;br /&gt;
 SELECT fundpeople.*,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 1::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 0::int&lt;br /&gt;
 	ELSE Null::int END AS titlefemale,&lt;br /&gt;
 CASE WHEN prefix='Ms' THEN 0::int&lt;br /&gt;
 	WHEN prefix='Mr' THEN 1::int&lt;br /&gt;
 	ELSE Null::int END AS titlemale,&lt;br /&gt;
 CASE WHEN prefix='Dr' THEN 1::int&lt;br /&gt;
 	ELSE 0::int END AS doctor,&lt;br /&gt;
 CASE WHEN prefix IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hastitle,&lt;br /&gt;
 CASE WHEN prefix IS NULL AND firstname IS NULL AND lastname IS NULL THEN 0::int&lt;br /&gt;
 	ELSE 1::int END AS hasperson&lt;br /&gt;
 FROM fundpeople;&lt;br /&gt;
 --328994&lt;br /&gt;
&lt;br /&gt;
The next table tries to identify the sex of the executive using the above defined namegender table. It only selects rows where a person is actually listed.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE FundPeopleFull;&lt;br /&gt;
 CREATE TABLE FundPeopleFull AS&lt;br /&gt;
 SELECT fundpeoplebase.*, exclusivelyfemale, exclusivelymale,&lt;br /&gt;
 CASE WHEN titlefemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelyfemale=1 AND exclusivelymale=0 AND (titlemale=0 OR titlemale IS NULL) THEN 1::int ELSE 0::int END AS female,&lt;br /&gt;
 CASE WHEN titlemale=1 THEN 1::int &lt;br /&gt;
 	WHEN exclusivelymale=1  AND exclusivelyfemale=0 AND (titlefemale =0 OR titlefemale IS NULL) THEN 1::int ELSE 0::int END AS male,	&lt;br /&gt;
 CASE WHEN (titlefemale=1 OR titlemale=1 OR exclusivelymale=1 OR exclusivelyfemale=1) THEN 0::int ELSE 1::int END AS unknowngender&lt;br /&gt;
 FROM fundpeoplebase&lt;br /&gt;
 LEFT JOIN namegender ON namegender.firstname=fundpeoplebase.firstname&lt;br /&gt;
 WHERE hasperson=1;&lt;br /&gt;
 --320915&lt;br /&gt;
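&lt;br /&gt;
As a quick sanity check on this step (a sketch, not part of the original loading script), the first query below looks for first names that appear more than once in namegender, since a duplicate would make the LEFT JOIN return extra rows per person. It assumes namegender is meant to have one row per firstname. The second query tabulates the combinations of the three gender flags, which makes it easy to spot any rows where the flags do not sum to one.&lt;br /&gt;
&lt;br /&gt;
 --Should return no rows if namegender has one row per firstname (assumed)&lt;br /&gt;
 SELECT firstname, count(*) AS numrows&lt;br /&gt;
 FROM namegender&lt;br /&gt;
 GROUP BY firstname&lt;br /&gt;
 HAVING count(*) &amp;gt; 1;&lt;br /&gt;
&lt;br /&gt;
 --Tabulate the gender flag combinations in FundPeopleFull&lt;br /&gt;
 SELECT female, male, unknowngender, count(*) AS numpeople&lt;br /&gt;
 FROM FundPeopleFull&lt;br /&gt;
 GROUP BY female, male, unknowngender&lt;br /&gt;
 ORDER BY female, male, unknowngender;&lt;br /&gt;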
&lt;br /&gt;
The next table gives you information on executives aggregated by fund.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE FundPeopleAgg;&lt;br /&gt;
 CREATE TABLE FundPeopleAgg AS&lt;br /&gt;
 SELECT fundname, &lt;br /&gt;
 sum(female) as numfemale,&lt;br /&gt;
 sum(male) as nummale,&lt;br /&gt;
 sum(unknowngender) as numunknowngender,&lt;br /&gt;
 sum(doctor) as numdoctor,&lt;br /&gt;
 sum(female*doctor) as numfemaledoctor,&lt;br /&gt;
 sum(male*doctor) as nummaledoctor,&lt;br /&gt;
 sum(unknowngender*doctor) as numunknowngenderdoctor,&lt;br /&gt;
 sum(hastitle) as numtitled,&lt;br /&gt;
 sum(hasperson) as numpeople, &lt;br /&gt;
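 --cast before dividing: sum() of an int column is a bigint, and bigint/bigint truncates&lt;br /&gt;
 CASE WHEN sum(hasperson) &amp;gt; 0 THEN sum(female)::real/sum(hasperson) ELSE NULL END as fracfemale,&lt;br /&gt;
 CASE WHEN sum(male) &amp;gt; 0 THEN sum(female)::real/sum(male) ELSE NULL END as ratiofemale&lt;br /&gt;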
 FROM FundPeopleFull&lt;br /&gt;
 GROUP BY fundname;&lt;br /&gt;
 --21536&lt;br /&gt;
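&lt;br /&gt;
Note that PostgreSQL truncates the result when both sides of a division are integers, and sum() over an int column returns a bigint. The self-contained sketch below uses made-up values rather than the real tables to show the difference a cast makes for a fraction of two integer sums.&lt;br /&gt;
&lt;br /&gt;
 SELECT sum(female)/sum(hasperson) AS truncated,&lt;br /&gt;
 sum(female)::real/sum(hasperson) AS fraction&lt;br /&gt;
 FROM (VALUES (1,1), (0,1), (0,1)) AS t(female, hasperson);&lt;br /&gt;
 --truncated is 0; fraction is roughly 0.33&lt;br /&gt;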
&lt;br /&gt;
It is also good to have this information on firms. We do not pull firm people information from SDC. However, we have enough information to create it from preexisting tables.&lt;br /&gt;
&lt;br /&gt;
 DROP TABLE firmpeopleagg;&lt;br /&gt;
 CREATE TABLE firmpeopleagg AS &lt;br /&gt;
 SELECT _firmname as firmname, sum(numfemale) as firmwomen, sum(nummale) as firmmen, sum(numunknowngender) as firmugs, &lt;br /&gt;
 sum(numdoctor) as firmdoctors, sum(numpeople) as firmpeople,&lt;br /&gt;
 CASE WHEN sum(numpeople) &amp;gt; 0 THEN (sum(numfemale)/sum(numpeople))::real ELSE NULL END as firmfracwomen,&lt;br /&gt;
 CASE WHEN sum(nummale) &amp;gt; 0 THEN (sum(numfemale)/sum(nummale))::real ELSE NULL END as firmratiowomen&lt;br /&gt;
 FROM roundlineaggfunds AS A&lt;br /&gt;
 JOIN fundpeopleagg AS B ON A._fundname=B.fundname&lt;br /&gt;
 GROUP BY _firmname;&lt;br /&gt;
 --5273&lt;br /&gt;
&lt;br /&gt;
==Marcos's Code==&lt;br /&gt;
This is code that a Rice student, Marcos Lee, wrote. I cleaned it and ran it. I have described the tables that I built and where they come from below. My code is located in:&lt;br /&gt;
 E:McNair\Projects\VentureXpert Database\vcdb3\LoadingScripts\MatchingEntrepsV3&lt;br /&gt;
&lt;br /&gt;
If you have issues understanding my explanation, go to this location and read the queries. Most of them are straightforward.&lt;br /&gt;
===Describing Stacks Created in Code===&lt;br /&gt;
 CoPeopleBase:&lt;br /&gt;
 -Builds from copeople and titlelookup&lt;br /&gt;
 -Identifies what roles people played in their companies&lt;br /&gt;
&lt;br /&gt;
 namegender:&lt;br /&gt;
 -built from copeoplebase&lt;br /&gt;
 -identifies male/female/unknown&lt;br /&gt;
&lt;br /&gt;
 CoPeopleFull:&lt;br /&gt;
 -built from copeoplebase and namegender&lt;br /&gt;
 -builds more extensive information on executives, including specifically what level of executive they are&lt;br /&gt;
&lt;br /&gt;
 CoPeopleKey:&lt;br /&gt;
 -built from CoPeopleFull&lt;br /&gt;
 -creates table where only executives with full primary keys are kept&lt;br /&gt;
&lt;br /&gt;
 CoPeopleSerial:&lt;br /&gt;
 -built from copeoplekey&lt;br /&gt;
 -keeps track of executives' previous jobs at the executive level&lt;br /&gt;
&lt;br /&gt;
 CoPeopleAgg:&lt;br /&gt;
 -built from copeoplekey and copeopleserial&lt;br /&gt;
 -gets extensive information on executives for each company&lt;br /&gt;
&lt;br /&gt;
 FundPeopleBase:&lt;br /&gt;
 -built from fundpeople&lt;br /&gt;
 -identifies male/female/doctor&lt;br /&gt;
 -the hasperson column is slightly odd: a row with only a lastname (no prefix or firstname) still gets a 1 in the column, so the flag seems too broad to be of much use&lt;br /&gt;
&lt;br /&gt;
 FundPeopleFull:&lt;br /&gt;
 -built from fundpeoplebase, namegender&lt;br /&gt;
 -adds in male/female &lt;br /&gt;
&lt;br /&gt;
 Fundpeopleagg:&lt;br /&gt;
 -built from fundpeoplefull&lt;br /&gt;
 -has aggregations of gender info for each fund&lt;br /&gt;
&lt;br /&gt;
 RoundLineJoinerLeanffWlistno:&lt;br /&gt;
 -built from roundlinejoinerleanff&lt;br /&gt;
 -adds listno to funds&lt;br /&gt;
&lt;br /&gt;
 RoundLineAggFunds:&lt;br /&gt;
 -built from roundlinejoinerleanffwlistno and roundlineaggfirms&lt;br /&gt;
 -if two or more funds from one firm invest in the same portco, we keep only one and drop the others&lt;br /&gt;
&lt;br /&gt;
 RoundLineAggWExit:&lt;br /&gt;
 -built from roundlineaggfirms, portcoexitupdated, roundlineaggfunds&lt;br /&gt;
 -adds in exit information for each company in roundlineaggfirms&lt;br /&gt;
&lt;br /&gt;
 FirmPerf:&lt;br /&gt;
 -built from roundlineaggwexit&lt;br /&gt;
 -adds in various performance measures for a given firm &lt;br /&gt;
&lt;br /&gt;
 PortCoFundDemo:&lt;br /&gt;
 -built from roundlinejoinerleanffclean and fundpeopleagg&lt;br /&gt;
 -gives information on executives of funds who invested in the portcos&lt;br /&gt;
&lt;br /&gt;
 PortCoPeopleMaster:&lt;br /&gt;
 -built from PortCoMaster, PortCoIndustry, PortCoPatent, PortCoSBIR, copeopleagg, PortCoFundDemo, CPI, statelookupint&lt;br /&gt;
 -huge amount of data about companies and their executives&lt;br /&gt;
&lt;br /&gt;
 RoundAggDistBase:&lt;br /&gt;
 -built from portcogeo, firmbogeo, roundlineaggwexit&lt;br /&gt;
 -creates geographic points using long, lat from geocoding&lt;br /&gt;
&lt;br /&gt;
 RoundAggDist:&lt;br /&gt;
 -Built from roundaggdistbase&lt;br /&gt;
 -gets actual distances between portcos and firms; if a branch office exists and is closer to the portco than the firm, uses the branch office distance instead. Also generates a random number&lt;br /&gt;
&lt;br /&gt;
 FirmPeopleAgg:&lt;br /&gt;
 -built from roundlineaggfunds, fundpeopleagg&lt;br /&gt;
 -finds information on executives from different firms&lt;br /&gt;
&lt;br /&gt;
 PortCoMatchmaster:&lt;br /&gt;
 -built from portcopatent, portcoindustry, portcosbir, copeopleaggsimple, portcoid&lt;br /&gt;
 -gets all information together about portcos&lt;br /&gt;
&lt;br /&gt;
 FirmMatchMaster:&lt;br /&gt;
 -built from firmperf, firmvars, firmpeopleagg, firmid&lt;br /&gt;
 -gets all information together about firms&lt;br /&gt;
&lt;br /&gt;
 RoundLineMasterBase:&lt;br /&gt;
 -built from portcomatchmaster, firmmatchmaster, roundaggdist, roundlineaggwexit&lt;br /&gt;
 -builds a large amount of information about portcos and firms, specifically info about exits and distances&lt;br /&gt;
&lt;br /&gt;
 MatchMostNumerous:&lt;br /&gt;
 -built from roundlinemasterbase&lt;br /&gt;
 -finds max number of portcos invested in by a firm that also invested in the company grouping by&lt;br /&gt;
&lt;br /&gt;
 MatchHighestRandom:&lt;br /&gt;
 -built from matchmostnumerous&lt;br /&gt;
 -if two firms that invested in one company had the same number of max port cos this randomly chooses one company&lt;br /&gt;
&lt;br /&gt;
 FirmActiveYearsCode20:&lt;br /&gt;
 -built from roundlinejoinerleanffclean, portcoindustry&lt;br /&gt;
 -adds firmname to industry code; not exactly sure why DISTINCT is used in the query&lt;br /&gt;
&lt;br /&gt;
 RealMatchesCode20:&lt;br /&gt;
 -built from MatchHighestRandom, PortCoIndustry&lt;br /&gt;
 -real matches between portcos and firms that invested in them including the code20&lt;br /&gt;
&lt;br /&gt;
 SyntheticFirmSetBaseCode20:&lt;br /&gt;
 -built from realmatchescode20, firmactiveyearscode20&lt;br /&gt;
 -cross product of firms and portcos. Finds firms that invested in the same year that the portco received its first investment, finds firms that invested in the same type of company, and makes sure matches are unique&lt;br /&gt;
&lt;br /&gt;
 AllMatchKeys:&lt;br /&gt;
 -built from SyntheticFirmSetBaseCode20, RealMatchesCode20&lt;br /&gt;
 -combines synthetic and real matches&lt;br /&gt;
&lt;br /&gt;
 SynthRoundAggDistBaseCode20:&lt;br /&gt;
 -built from allmatchkeys, portcogeo, firmbogeo&lt;br /&gt;
 -builds points for all portco, firm listings in allmatchkeys&lt;br /&gt;
&lt;br /&gt;
 SynthRoundAggDistCode20:&lt;br /&gt;
 -built from synthroundaggdistbasecode20&lt;br /&gt;
 -finds actual distance between portcos and firms using installed extensions; uses the branch office instead if the distance between the portco and the branch office is less than the distance to the firm&lt;br /&gt;
&lt;br /&gt;
 SynthFirmnameInduBlowoutCode20:&lt;br /&gt;
 -built from allmatchkeys, roundlinemasterbase&lt;br /&gt;
 -gets every firm combination and checks whether the companies that those firms invested in are in the same general industry&lt;br /&gt;
&lt;br /&gt;
 SynthFirmNameroundInduHistCode20:&lt;br /&gt;
 -built from SynthFirmnameInduBlowoutcode20&lt;br /&gt;
 -gets information, by portco and firmname match, about each firm's past investment patterns&lt;br /&gt;
&lt;br /&gt;
 MasterWithSynthBaseCode20Portco:&lt;br /&gt;
 -built from Allmatchkeys, matchhighestrandom, synthroundaggdistcode20, synthfirmnameroundinduhistcode20, synthfirmnameroundindutotalcode20, firmvars, copeopleaggsimple, portcomaster&lt;br /&gt;
 -builds a bunch of information about synthetic and real matches&lt;br /&gt;
&lt;br /&gt;
 SynthFirmnameRoundInduTotalCode20:&lt;br /&gt;
 -built from allmatchkeys, roundlinemasterbase&lt;br /&gt;
 -finds number of portcos in certain industries by firmnames&lt;br /&gt;
&lt;br /&gt;
 MasterWithSynthCode20Firms:&lt;br /&gt;
 -built with firmmatchmaster, allmatchkeys&lt;br /&gt;
 -matching a bunch of information to all firms&lt;br /&gt;
&lt;br /&gt;
 MasterWithSynthcode20:&lt;br /&gt;
 -built from masterwithsynthbasecode20portco, masterwithsynthcode20firms&lt;br /&gt;
 -gets a huge amount of info together on real and synthetic matches about firms and companies&lt;br /&gt;
&lt;br /&gt;
 MasterReals:&lt;br /&gt;
 -built from masterwithsynthcode20&lt;br /&gt;
 -gets just real matches from code&lt;br /&gt;
&lt;br /&gt;
 MasterOneSynth:&lt;br /&gt;
 -built from masterwithsynthcode20&lt;br /&gt;
 -gets just one randomly chosen synthetic match between companies and firms&lt;br /&gt;
&lt;br /&gt;
 MasterRealOneSynth:&lt;br /&gt;
 -built from masteronesynth, masterreals&lt;br /&gt;
 -combines the real and one synth table&lt;br /&gt;
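&lt;br /&gt;
The random selection steps above (MatchHighestRandom and MasterOneSynth) can be sketched with PostgreSQL's DISTINCT ON. This is a minimal illustration with made-up rows and hypothetical column names (portco, firm, numportcos), not the query in MatchingEntrepsV3, which may do this differently. Sorting by numportcos DESC and then random() keeps, for each portco, the firm with the most portcos and breaks ties at random.&lt;br /&gt;
&lt;br /&gt;
 SELECT DISTINCT ON (portco) portco, firm, numportcos&lt;br /&gt;
 FROM (VALUES ('A','F1',5), ('A','F2',5), ('B','F3',2)) AS m(portco, firm, numportcos)&lt;br /&gt;
 ORDER BY portco, numportcos DESC, random();&lt;br /&gt;
 --returns one row per portco: A with either F1 or F2, and B with F3&lt;br /&gt;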
&lt;br /&gt;
==Ranking Tables and Graphs==&lt;br /&gt;
This is a slight detour from the creation of VCDB3. However, this is a cool process because you actually get to use the data you've been working with. The process is extensive, but the queries are easy to understand. If you wish to have a deeper understanding of the process, read the code. It is located in:&lt;br /&gt;
&lt;br /&gt;
 E:McNair\Projects\VentureXpert Database\vcdb3\LoadingScripts\RoundRanking.SQL&lt;br /&gt;
&lt;br /&gt;
First you must create a table that has aggregate round information grouped by cities and round year. Since this is a little difficult to picture, I will attach the code.&lt;br /&gt;
 DROP TABLE roundleveloutput;&lt;br /&gt;
 CREATE TABLE roundleveloutput AS SELECT&lt;br /&gt;
 city, statecode, roundyear AS year,&lt;br /&gt;
 SUM(rndamtestm*seedflag) AS seedamnt,&lt;br /&gt;
 SUM(rndamtestm*earlyflag) AS earlyamnt,&lt;br /&gt;
 SUM(rndamtestm*laterflag) AS lateramnt,&lt;br /&gt;
 SUM(rndamtestm*growthflag) AS selamnt,&lt;br /&gt;
 SUM(growthflag*dealflag) AS numseldeals&lt;br /&gt;
 FROM round GROUP BY city, statecode, roundyear;&lt;br /&gt;
 --30028&lt;br /&gt;
&lt;br /&gt;
Next, create a table that lists the all-time SEL amount by city. Keep including the state code, since city names are often repeated across states and the state code ensures that you have the right city. Next, create a table that lists each unique city, state pair for every year since 1980. Then build a table that matches portcos to this city, state, year blowout table for each year they were alive. This table should be relatively large, since it lists each company once for every year it was alive up to the present. Then create a table that gives the number of companies alive in each city in every year since 1980. Then build a table that joins all of the information from the previous tables on city, state, and year, and add in population. Then you can run the ranking queries.&lt;br /&gt;
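&lt;br /&gt;
As a rough sketch of two of these steps (the table name citystateyear and the column name selrank are hypothetical, and the actual queries in RoundRanking.SQL may differ), the first statement builds the city, state, year blowout with generate_series, and the second ranks cities within each year by SEL amount using the roundleveloutput table created above.&lt;br /&gt;
&lt;br /&gt;
 --One row per distinct city, state and year from 1980 to the current year&lt;br /&gt;
 CREATE TABLE citystateyear AS&lt;br /&gt;
 SELECT c.city, c.statecode, y.year&lt;br /&gt;
 FROM (SELECT DISTINCT city, statecode FROM roundleveloutput) AS c&lt;br /&gt;
 CROSS JOIN generate_series(1980, extract(year FROM current_date)::int) AS y(year);&lt;br /&gt;
&lt;br /&gt;
 --Rank cities within each year by SEL amount&lt;br /&gt;
 SELECT year, city, statecode, selamnt,&lt;br /&gt;
 rank() OVER (PARTITION BY year ORDER BY selamnt DESC NULLS LAST) AS selrank&lt;br /&gt;
 FROM roundleveloutput;&lt;br /&gt;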
&lt;br /&gt;
For states, follow the same general process, but group by state rather than by city and state.&lt;br /&gt;
&lt;br /&gt;
If this explanation was not enough for you (it was not meant to be in depth) go to the location defined above and read the actual code. With the description I have given, you should be able to piece together what each query does.&lt;br /&gt;
&lt;br /&gt;
==Master Tables==&lt;br /&gt;
Throughout the creation of the database, there are inevitably some tables that are vital to create a solid foundation. The following tables are the master tables with a quick explanation:&lt;br /&gt;
* '''Companybasecore'''- The base table for portcos. This is data that was drawn directly from SDC and was not changed other than for cleaning purposes. Count: 48001&lt;br /&gt;
* '''BranchOfficeCore'''- The base table for branch offices. This is data drawn directly from SDC. Here only branch offices with distinct firm names are included. Count: 10032&lt;br /&gt;
* '''FirmBaseCore'''- The base table for firms. This is also data taken directly from SDC and was not changed other than for cleaning purposes. Count: 15437&lt;br /&gt;
* '''FundBaseCore'''- The base table for funds. This is also data taken directly from SDC and was not changed other than for cleaning purposes. Count: 28833&lt;br /&gt;
* '''IPOCleanNoDups''' - This is the clean table of IPOs after being run through the matcher against portcos. It was cleaned manually and had duplicates removed. Count: 2136&lt;br /&gt;
* '''IPONoDups'''- This is the table before the cleaning process of matching to portcos. There could be problems with this table as we used an aggregate function here. Be careful using this table. Count: 11149&lt;br /&gt;
* '''MACleanNoDups'''- This is the clean table of MAs after being run through the matcher against portcos. It was cleaned manually and had duplicates removed. Count: 7171&lt;br /&gt;
* '''MANoDups'''- This is the table before the cleaning process of matching to portcos. There could be problems with this table as we used an aggregate function here as well. Be careful using this table. Count: 119374&lt;br /&gt;
* '''Round'''- This is the master round table. It has SEL flags attached to it and has the most round info. RoundBaseClean is also a decent table but has less information. This table is your best bet for round information. Count: 151323&lt;br /&gt;
* '''RoundLineJoinerLeanFFClean'''- This is the master round table for joining purposes. It was cleaned and used for widespread joining purposes. Count: 163157&lt;br /&gt;
* '''CoPeople'''- This is the base table for PortCo people information. It was pulled directly from SDC. Count: 194359&lt;br /&gt;
* '''FirmBoGeo'''- This is the base table for firm/branch office geocoding. This table was cleaned and contains lat/long readings for firms and branch offices where the information was available. Count: 15437&lt;br /&gt;
* '''PortCoGeo'''- This is the base table for portco geocoding. Table was cleaned and contains lat/long reading for portcos where the Google API returned a valid reading. Count: 48001&lt;br /&gt;
* '''FirmPerf'''- This is a wide reaching table about the performance of firms. It was mainly used later in the project but is extremely useful. Count: 8336&lt;br /&gt;
* '''FundPeople'''- This is the base table for fund people information. It was pulled directly from SDC. Count: 328994.&lt;br /&gt;
* '''PortCoExitUpdated'''- This is the master exit table for portcos. The difference between this and PortCoExit is that Updated has two columns marking MAs and IPOs, while the other has one column, MAvsIPO. Use whichever one is more convenient. Count: 48001&lt;br /&gt;
* '''PortCoMaster'''- This table is great. There's a ton of information on PortCos including SEL flags, round amounts, and industry classifications. Count: 48001&lt;/div&gt;</summary>
		<author><name>Ed</name></author>
		
	</entry>
</feed>