Author/Creator: Jan Goyvaerts
Curated By: Toby Hodges
Field(s): regular expressions, general programming
Description: Great for easing the pain of working with large text files, e.g. GFF! Regular expressions are a way of specifying patterns to match in text, like runs of numbers, nucleotide/amino acid sequences, gene names/IDs, etc.
A general introduction to the wonderful world of regular expressions. The website is not particularly pretty, and the terminology can be confusing, but this is a well laid-out and thorough tutorial covering everything from the basics right the way through to advanced regular expression construction.
If you use it alongside a regular expression visualisation tool such as Regexper or Debuggex, you should quickly get the hang of writing expressions that make text mining and data extraction tasks much faster. The same webpage also offers a second tutorial covering replacement strings, a complementary topic that will help you learn to perform complex search-and-replace operations quickly on huge volumes of text/data.
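As a rough illustration of the kind of task described above, the sketch below uses Python's built-in re module to pull IDs out of GFF-style lines and then to run a search-and-replace with a replacement string. The sample lines and the names gff_lines, id_pattern, ids and renamed are invented for this example and do not come from the tutorial.

```python
import re

# Hypothetical GFF3-style records (tab-separated), made up for illustration.
gff_lines = [
    "chr1\texample\tgene\t1300\t9000\t.\t+\t.\tID=gene00001;Name=EDEN",
    "chr1\texample\tmRNA\t1300\t9000\t.\t+\t.\tID=mRNA00001;Parent=gene00001",
]

# Match "ID=" followed by a run of characters that are not ';'
# (';' separates attributes in the last column). The parentheses
# capture just the ID value.
id_pattern = re.compile(r"ID=([^;]+)")

ids = []
for line in gff_lines:
    match = id_pattern.search(line)
    if match:
        ids.append(match.group(1))

# Search-and-replace with a replacement string: \1 re-inserts the digits
# captured by (\d+), so "chr1" at the start of a line becomes "chromosome1".
renamed = [re.sub(r"^chr(\d+)", r"chromosome\1", line) for line in gff_lines]

print(ids)
print(renamed[0].split("\t")[0])
```

The same patterns can be tried in a text editor's regex search box or pasted into a visualisation tool to see how each part matches. Regex syntax differs slightly between tools, so a pattern may need small adjustments outside Python.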
Software: At minimum, a decent text editor (Notepad++, TextWrangler, etc.) with a regex/grep search option. Perl is excellent for this, and Python, R and most other programming languages also provide regex functionality.
Level: Beginner, Intermediate
Format: Online (browser-based)
Time Required: 2-8 hours
Prerequisites: See “Software” in the tutorial description.
Last Curated: 08/29/2016