Regular Expression Cheat Sheet

From edegan.com

Revision as of 18:20, 7 October 2016

This is a page for regular expression hacks. Chronicle your exploits so that others can benefit from your ingenuity!

Useful RegExes

Pattern    Matches
------------------
\t         tab
\n         newline
^          start of line
$          end of line
.          any single character (except newline)
*          0 or more times
+          1 or more times
?          0 or 1 time
\s         whitespace (space, tab, newline)
\d         digit
[0-9]      any digit (once)
[a-z]      any lowercase letter (once)
[a-zA-Z]   any letter, either case (once)
abc        abc
[abc]      a or b or c (once); (a|b|c) also works, but | inside [] is a literal |
{1,3}      1, 2, or 3 times in a row
{3,}       3 or more times
()         captures whatever is in the parentheses
\          escape the next thing (e.g. \} matches })
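
A few of these patterns can be checked quickly outside TextPad. The sketch below uses Python's re module, which is an assumption on our part (the page itself only mentions TextPad), and the sample strings are made up for illustration. Regex dialects differ in small ways, so treat it as a sanity check rather than a statement of how TextPad behaves.

```python
import re

# (pattern, sample text, matches that re.findall returns)
examples = [
    (r"\d", "a1b22", ["1", "2", "2"]),                   # one digit at a time
    (r"\d+", "a1b22", ["1", "22"]),                      # 1 or more digits
    (r"[a-zA-Z]+", "abc DEF 123", ["abc", "DEF"]),       # letters, either case
    (r"colou?r", "color colour", ["color", "colour"]),   # u is optional
    (r"\d{1,3}", "12345", ["123", "45"]),                # 1 to 3 digits in a row
    (r"a|b|c", "xaybzc", ["a", "b", "c"]),               # alternation
    (r"[a|b]", "a|b", ["a", "|", "b"]),                  # | is literal inside []
    (r"\}", "{x}", ["}"]),                               # escaped brace
    (r"^[a-z]+", "first second", ["first"]),             # anchored at the start
]

for pattern, text, expected in examples:
    found = re.findall(pattern, text)
    print(f"{pattern!r:14} on {text!r:16} -> {found}")
    assert found == expected

# Parentheses capture; \1 and \2 refer back to the captured groups.
swapped = re.sub(r"(\d+)-(\d+)", r"\2-\1", "10-20")
print(swapped)  # 20-10
```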


Lex Machina

Task

Use Patent Portfolio Report to pull unique patent numbers for patents litigated in 2015

Steps Taken

  1. Filtered Lex Machina until total patents were under 2000.
  2. Ran Lex Machina's Patent Portfolio Report.
  3. Selected all with Ctrl-A, copied, then pasted into TextPad.
    • Patent numbers were the first word in every line.
  4. Used the replace command (F8) to find "(^.+?)\s.*$" and replace with "\1".
    • Make sure "regular expression" is checked.
  5. That left only the patent numbers.
  6. Repeated the above steps until all 2015 patents were in TextPad.
  7. Exported the data to Excel and removed duplicates (Data --> Remove Duplicates).
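
The find-and-replace and de-duplication steps above could also be scripted. The following is a minimal sketch in Python, not the method actually used here: the function name extract_patent_numbers and the sample report text are hypothetical, and the real layout of a Lex Machina report may differ. It applies the same pattern one line at a time, so that \s cannot match a line break and merge two lines, then drops repeats while keeping the original order.

```python
import re

# Same pattern as the TextPad replace: capture the first word, discard the rest.
FIRST_WORD = re.compile(r"(^.+?)\s.*$")


def extract_patent_numbers(report_text):
    """Return the first word of each non-blank line, without duplicates."""
    numbers = []
    for line in report_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Lines with no whitespace (a lone number) are left unchanged.
        numbers.append(FIRST_WORD.sub(r"\1", line))
    # dict.fromkeys keeps the first occurrence of each number, in order.
    return list(dict.fromkeys(numbers))


# Made-up sample standing in for pasted report text.
sample = (
    "1234567 Example title A\n"
    "7654321\tExample title B\n"
    "1234567 Example title A\n"
)
print(extract_patent_numbers(sample))  # ['1234567', '7654321']
```

Running this over each pasted batch and concatenating the results would replace both the TextPad replace and the Excel Remove Duplicates step, assuming every line really does begin with the patent number.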