Ok, so, you might know about JavaScript regular expressions. Well, here is a tutorial about them, but written by a 13 year old, so it isn't actually any good!
Regular expressions go between / characters. Here is an example: /hi/.
Ok, now then. Let's learn how to match the string abc. Well, that's quite simple.
/abc/. Yey! So putting letters next to each other makes them match one after the other.
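Ok, now, after the second / we can put a g to make it match globally, that is, it finds every abc in the string instead of stopping at the first one. (Even without the g, /abc/ can already find the abc inside xyzabcghi.)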
/abc/g.
Great! Eh?
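Here's a rough sketch of the difference the g makes, assuming you paste it into a browser console or Node. The sample string is made up, and match() is just one of several ways to run a regular expression.

```javascript
// Sample text, made up for illustration: it contains abc twice.
const text = "xyzabcghi abc";

// Without g, match() stops at the first match.
const first = text.match(/abc/);
console.log(first[0], first.index); // abc 3

// With g, match() returns every match as a plain array.
const all = text.match(/abc/g);
console.log(all); // [ 'abc', 'abc' ]
```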
What if we want to match either an A, a B, or a C, but not all three one after the other? Well, if you put them into these square brackets ([]) then you create what's called a character class, which is a group of characters, where any of them could be matched! Great, right?
/[abc]/g, now that matches a, b, and c. Great, right?
They are great, aren't they?
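A quick sketch of a character class at work. The word "cabbage" is just a sample I picked, nothing special about it.

```javascript
// Each a, b, or c in the text becomes its own match.
const word = "cabbage";
const hits = word.match(/[abc]/g);
console.log(hits); // [ 'c', 'a', 'b', 'b', 'a' ]
```

The g and the e are not in the class, so they get skipped.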
Well, I want to match a single digit! This should be easy, we already know how to make a character class.
/[0123456789]/g. Done, right?
Well, yes, it works. But it's a bit long, isn't it?
I wish there was a way of saying "a number between 0 and 9". Well, it turns out there is! Yey!
/[0-9]/g. Wow, that's much shorter. What if I want to match a digit, or a decimal point? Well, we can do that! /[0-9.]/g.
Huh, that looks a bit weird? What happened? Well, remember that [0-9] means [0123456789] so [0-9.] means [0123456789.].
That makes sense.
Can I do that, but without using all the numbers? Let's say I have a character class, [34567]. How can we shorten that?
Well, [3-7] is the answer! Yey!
What about letters, can we do the alphabet? Yes! [a-z] WOW!
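Here's a small sketch trying out the ranges from above. The sample strings are made up for illustration.

```javascript
const price = "Total: 3.75 for 12 eggs";

// Digits or a decimal point.
const digitsAndDots = price.match(/[0-9.]/g);
console.log(digitsAndDots); // [ '3', '.', '7', '5', '1', '2' ]

// Only the digits from 3 to 7, so the 1 and 2 are left out.
const threeToSeven = price.match(/[3-7]/g);
console.log(threeToSeven); // [ '3', '7', '5' ]

// Lowercase letters only, so the capital H and B are left out.
const lowercase = "Hi Bob".match(/[a-z]/g);
console.log(lowercase); // [ 'i', 'o', 'b' ]
```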
So, now that we're going for letters, we might want to be able to not care about whether a letter is uppercase or lowercase.
The way we do that is by putting an i after the second /. So, let's say we have /abc/g which matches abc ONLY. If we do /abc/gi (or /abc/ig, it doesn't matter), then we can match
abc (still), abC, aBc, aBC, Abc, AbC, ABc and ABC. That's way more possibilities!
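A little sketch of the i flag, with a made-up sample string. The flags go after the closing /, and their order doesn't change the result.

```javascript
const shout = "abc ABC aBc abd";

// Case-sensitive: only the all-lowercase abc matches.
const strict = shout.match(/abc/g);
console.log(strict); // [ 'abc' ]

// Case-insensitive: any mix of upper and lower case matches.
const relaxed = shout.match(/abc/gi);
console.log(relaxed); // [ 'abc', 'ABC', 'aBc' ]

// Same flags in the other order, same result.
const sameThing = shout.match(/abc/ig);
console.log(sameThing); // [ 'abc', 'ABC', 'aBc' ]
```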
Never, ever, underestimate the backslash. What it does is, it gives characters that don't have a special meaning a special meaning, and takes away the special meaning from characters that do.
Let's do a quick example! /abc\[/g matches "abc[". Usually, [ means the beginning of a character class, but not if you put a \ before it!
And \, it has a special meaning, so if you want to match the string "abc\[" then you need to escape both the \ and the [.
So, we get, /abc\\\[/g. abc for the abc, \\ for the \ and \[ for the [.
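Here's a sketch of escaping in action. One thing to watch out for: inside a JavaScript string the backslash also has to be doubled, so the variable names and sample strings below are just my way of keeping the two apart.

```javascript
// In a JavaScript string, "abc\\[" is the five characters  a b c \ [
const withBracket = "abc[";
const withBackslash = "abc\\[";

// An escaped [ matches a real [ character.
const plain = /abc\[/.test(withBracket);
console.log(plain); // true

// \\ matches the backslash, then \[ matches the bracket.
const both = /abc\\\[/.test(withBackslash);
console.log(both); // true

// There is no backslash in "abc[", so the longer pattern fails on it.
const missing = /abc\\\[/.test(withBracket);
console.log(missing); // false
```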
So, we have already shortened our digit-matching code to [0-9]. Can we get shorter? As it turns out, if you put \d then the d gets some special meaning! It means "digit".
Let's try it out: /abc\d/ is the same as /abc[0-9]/. Isn't this great? I, personally, think it is.
Even cooler, if you make the d a capital letter, then it negates its meaning. So, for example, \d means digit, \D means NOT a digit.
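A quick sketch comparing \d with the [0-9] version, and trying out \D. The sample strings are made up.

```javascript
const code = "abc7 abcx a1b2";

// \d and [0-9] find the same thing here.
const withDigit = code.match(/abc\d/g);
const sameAsRange = code.match(/abc[0-9]/g);
console.log(withDigit, sameAsRange); // [ 'abc7' ] [ 'abc7' ]

// \D is the opposite: every character that is NOT a digit.
const nonDigits = "a1b2".match(/\D/g);
console.log(nonDigits); // [ 'a', 'b' ]
```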
- \b, a word boundary, that is, the point between a word-character and something that isn't one (like a space), or the start or end of the string if a word-character is right there (see \w about word-characters). **Important:** word boundaries are points of length zero where the change between word-characters and non-word-characters occurs, and they don't match characters!
- \B, anywhere that isn't a word boundary
- \c..., it's complicated, and I wouldn't worry about it 😄. Note that this doesn't have a negative, and also that there are two characters after the backslash, which is unusual.
- \d, a digit; /[0-9]/ is the long way of writing it
- \D, anything that isn't a digit
- \f, form feed. This is a character. It doesn't have a negative.
- \n is a newline character. It's what separates lines on most operating systems. Doesn't have a negative.
- \r is a carriage return, it's a bit like the \n character.
- \s is a space-character, and it includes the tab character, the space character, the newline character, the carriage return character, and many more.
- \S is everything that isn't a space character
- \t is the tab character, you know, the one that takes up about 4 spaces worth of gap.
- \v, something called a "vertical tab". I know, right?
- \w, a word-character! This is the same as /[a-z0-9_]/i (or, /[a-zA-Z0-9_]/).
- \W, everything that isn't a word character
- \<number goes here>, we'll cover these later!
- \0 is a NUL character, which you shouldn't need to worry about.
- there are a couple more, but we will cover those later.
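To make a few of these less abstract, here's a small sketch using \b, \w and \s. The sample strings are made up, and this only covers three of the escapes in the list.

```javascript
// \b only lets "cat" match when it stands alone as a word,
// so the cat inside "concat" and "cats" is skipped.
const wholeWords = "cat concat cats".match(/\bcat\b/g);
console.log(wholeWords); // [ 'cat' ]

// \w matches letters, digits and the underscore, but not the space or the -.
const wordChars = "a_1 b-2".match(/\w/g);
console.log(wordChars); // [ 'a', '_', '1', 'b', '2' ]

// \s matches a space, a tab and a newline, so we can split on any of them.
const pieces = "one two\tthree\nfour".split(/\s/);
console.log(pieces); // [ 'one', 'two', 'three', 'four' ]
```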