Last active
February 25, 2022 13:43
Revisions
regexyl revised this gist
Feb 25, 2022 . 1 changed file with 2 additions and 0 deletions.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.

@@ -77,6 +77,7 @@ Finding a non-word boundary? Just find the word boundaries, remove them, and eve
- Parenthesized Back References (Capture Group)
  - `()`: Creates a capture group for extracting a substring or using a back reference.
  - Use `$1`, `$2`, ... (JS, Java, Perl), or `\1`, `\2`, ... (Python) in the replacement string to retrieve the captured groups in sequential order. Inside the pattern itself, JavaScript also uses `\1`, `\2`, ...
  - `(?:...)`: A non-capturing group; groups a subpattern without capturing it, so it is omitted from the resulting list of captures. [^so-cg]
- Character Class (or Bracket List)
  - `[]`
    - `[...]`: Accept any *one* of the characters within the brackets.

@@ -96,6 +97,7 @@ Finding a non-word boundary? Just find the word boundaries, remove them, and eve
[^ntu-guide]: https://www3.ntu.edu.sg/home/ehchua/programming/howto/Regexe.html#:~:text=On%20the%20other%20hand%2C%20the,%5CD%20or%20non%2Ddigit
[^moz-eg]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#examples
[^so-boundary]: https://stackoverflow.com/questions/4541573/what-are-non-word-boundary-in-regex-b-compared-to-word-boundary
[^so-cg]: Lu, S. (2014, January 29). Use of capture groups in String.split(). Stack Overflow. https://stackoverflow.com/questions/21419530/use-of-capture-groups-in-string-split

### Awesome Resources
- https://riptutorial.com/regex
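The difference between `()` and `(?:...)` is easy to see with `String.prototype.split()`, which is the case the Stack Overflow footnote discusses. The snippet below is only a small sketch; the sample string and variable names are made up for illustration.

```js
// Hypothetical sample input; any digit-separated string would do.
const input = 'a1b2c';

// A capturing group makes split() keep the matched separators in the result.
const withCapture = input.split(/(\d)/);

// A non-capturing group only groups, so the separators are dropped.
const withoutCapture = input.split(/(?:\d)/);

console.log(withCapture);    // [ 'a', '1', 'b', '2', 'c' ]
console.log(withoutCapture); // [ 'a', 'b', 'c' ]
```
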
regexyl revised this gist
Feb 25, 2022 . 1 changed file with 6 additions and 0 deletions.

@@ -32,6 +32,12 @@ function f2c(x) {
  return s.replace(test, convert);
}
```
5. Capturing the matched pattern
```js
// Escape every character that has a special meaning in a regex.
const regexChars = /[\\^$.*+?()[\]{}|]/g;
const str = 'as[b*';
// In the replacement string, `$&` stands for the whole match.
console.log(str.replace(regexChars, `\\$&`)); // logs: as\[b\*
```

## Possible Trip-Ups
### `\b` and `\B`: Matching [non-]word boundaries
regexyl revised this gist
Feb 25, 2022 . 1 changed file with 2 additions and 1 deletion.

@@ -50,7 +50,6 @@ Finding a non-word boundary? Just find the word boundaries, remove them, and eve
[^so-boundary]

## Syntax
- Metacharacters
  - `.`: Any *one* character except a line terminator (in JavaScript, `[^\n\r\u2028\u2029]` unless the `s` flag is set).
  - `\d`, `\D`: Any *one* digit/non-digit character (where digits are `[0-9]`).

@@ -83,6 +82,8 @@ Finding a non-word boundary? Just find the word boundaries, remove them, and eve
  - Regex recognizes common escape sequences such as `\n` for newline, `\t` for tab, `\r` for carriage return, `\nnn` for an octal number of up to 3 digits, `\xhh` for a two-digit hex code, and `\uhhhh` for a 4-digit Unicode escape. The 8-digit `\uhhhhhhhh` form is not JavaScript syntax; JavaScript uses `\u{...}` with the `u` flag for longer code points.
- Laziness
  - `*?`, `+?`, `??`, `{m,n}?`, `{m,}?`: Curbs greediness for repetition operators.
- Capturing matched pattern
  - `$&`: In a replacement string, inserts the whole matched substring.

[^ntu-guide]
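A short sketch of the two entries added in this revision: lazy quantifiers and `$&`. The sample strings are invented for illustration and are not part of the gist.

```js
// Hypothetical sample markup.
const html = '<b>bold</b> and <i>italic</i>';

// Greedy: `.+` grabs as much as possible, up to the last `>`.
const greedy = html.match(/<.+>/)[0];

// Lazy: `.+?` stops at the first `>` it can.
const lazy = html.match(/<.+?>/)[0];

// `$&` in the replacement string is the whole matched substring.
const wrapped = 'cat'.replace(/a/, '[$&]');

console.log(greedy);  // <b>bold</b> and <i>italic</i>
console.log(lazy);    // <b>
console.log(wrapped); // c[a]t
```
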
regexyl revised this gist
Feb 25, 2022 . 1 changed file with 4 additions and 2 deletions.

@@ -34,17 +34,19 @@ function f2c(x) {
```

## Possible Trip-Ups
### `\b` and `\B`: Matching [non-]word boundaries
A word boundary (`\b`) is a *zero-width match* that can match:
- Between a word character (`\w`) and a non-word character (`\W`), or
- Between a word character and the start or end of the string.

`\B` is the inverse of `\b` and is also *zero-width*. It can match:
- Between two word characters.
- Between two non-word characters.
- Between a non-word character and the start or end of the string.
- The empty string.

Finding a non-word boundary? Find the word boundaries and remove them; every position that is left is a non-word boundary.
[^so-boundary]

## Syntax
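One way to see where `\b` and `\B` match is to insert a marker at every zero-width match. This is only an illustrative sketch; the sample strings and the `|` marker are arbitrary choices.

```js
// Hypothetical sample text.
const sample = 'ab cd';

// Mark every word boundary, then every non-word boundary.
const boundaries = sample.replace(/\b/g, '|');
const nonBoundaries = sample.replace(/\B/g, '|');

// `\bcat\b` only matches "cat" as a whole word.
const wholeWords = 'cat concat cats'.match(/\bcat\b/g);

// The empty string contains a non-word boundary but no word boundary.
const emptyHasB = /\b/.test('');
const emptyHasNonB = /\B/.test('');

console.log(boundaries);    // |ab| |cd|
console.log(nonBoundaries); // a|b c|d
console.log(wholeWords);    // [ 'cat' ]
console.log(emptyHasB, emptyHasNonB); // false true
```
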
regexyl revised this gist
Feb 25, 2022 . 1 changed file with 20 additions and 4 deletions.

@@ -2,7 +2,7 @@
The JavaScript version.

## Frequent Examples
Search for: [^moz-eg]
1. "/example/": `/\/example\/[a-z]+/i`
2. Switch words in a string
```js
@@ -33,6 +33,20 @@ function f2c(x) {
}
```

## Possible Trip-Ups
### `\b` and `\B`: Matching [non-]boundary characters
A word boundary (`\b`) is a *zero-width match* that can match:
- Between a word character (`\w`) and a non-word character (`\W`), or
- Between a word character and the start or end of the string.

`\B` is the inverse of `\b`. It can match:
- Between two word characters.
- Between two non-word characters.
- Between a non-word character and the start or end of the string.
- The empty string.

[^so-boundary]

## Syntax
###
- Metacharacters

@@ -68,9 +82,11 @@ function f2c(x) {
- Laziness
  - `*?`, `+?`, `??`, `{m,n}?`, `{m,}?`: Curbs greediness for repetition operators.

[^ntu-guide]

[^ntu-guide]: https://www3.ntu.edu.sg/home/ehchua/programming/howto/Regexe.html#:~:text=On%20the%20other%20hand%2C%20the,%5CD%20or%20non%2Ddigit
[^moz-eg]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#examples
[^so-boundary]: https://stackoverflow.com/questions/4541573/what-are-non-word-boundary-in-regex-b-compared-to-word-boundary

### Awesome Resources
- https://riptutorial.com/regex
regexyl revised this gist
Feb 25, 2022 . 1 changed file with 3 additions and 0 deletions.

@@ -71,3 +71,6 @@ function f2c(x) {
## References
- https://www3.ntu.edu.sg/home/ehchua/programming/howto/Regexe.html#:~:text=On%20the%20other%20hand%2C%20the,%5CD%20or%20non%2Ddigit
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#switching_words_in_a_string

### Awesome Resources
- https://riptutorial.com/regex
regexyl revised this gist
Feb 25, 2022 . 1 changed file with 38 additions and 5 deletions.

@@ -2,7 +2,36 @@
The JavaScript version.

## Frequent Examples
Search for:
1. "/example/": `/\/example\/[a-z]+/i`
2. Switch words in a string
```js
let re = /(\w+)\s(\w+)/;
let str = 'John Smith';
// $1 and $2 refer to the first and second capture groups.
let newstr = str.replace(re, '$2, $1');
console.log(newstr); // Smith, John
```
3. Using an inline function that modifies the matched characters
```js
function styleHyphenFormat(propertyName) {
  // Called once per match; `offset` is the index of the match in the string.
  function upperToHyphenLower(match, offset, string) {
    return (offset > 0 ? '-' : '') + match.toLowerCase();
  }
  return propertyName.replace(/[A-Z]/g, upperToHyphenLower);
}
console.log(styleHyphenFormat('borderTop')); // border-top
```
4. Converting Fahrenheit to Celsius
```js
function f2c(x) {
  // `p1` is the captured number (as a string); subtraction coerces it to a number.
  function convert(str, p1, offset, s) {
    return ((p1 - 32) * 5/9) + 'C';
  }
  let s = String(x);
  let test = /(-?\d+(?:\.\d*)?)F\b/g; // (?:...) is a non-capturing group
  return s.replace(test, convert);
}
console.log(f2c('212F')); // 100C
```

## Syntax
###

@@ -12,7 +41,7 @@ The JavaScript version.
  - `\w`, `\W`: Any *one* word/non-word character. For ASCII, word characters are `[a-zA-Z0-9_]`.
  - `\s`, `\S`: Any *one* space/non-space character. For ASCII, whitespace characters are `[ \n\r\t\f]`.
- Occurrence Indicators
  - `+`: One or more, e.g. `[0-9]+` matches 1 or more digits, such as "123", "0000".
  - `*`: Zero or more (accepts the above + empty strings).
  - `?`: Zero or one (optional), e.g. `[+-]?` matches an optional "+", "-", or an empty string.
  - `{}`

@@ -24,8 +53,9 @@ The JavaScript version.
  - `$`: End of line
  - `\b`: Boundary of word, i.e., start-of-word or end-of-word. E.g., `\bcat\b` matches the word "cat" in the input string.
  - `\B`: Inverse of `\b`, i.e. non-start-of-word or non-end-of-word.
- Parenthesized Back References (Capture Group)
  - `()`: Creates a capture group for extracting a substring or using a back reference.
  - Use `$1`, `$2`, ... (JS, Java, Perl), or `\1`, `\2`, ... (Python) in the replacement string to retrieve the captured groups in sequential order. Inside the pattern itself, JavaScript also uses `\1`, `\2`, ...
- Character Class (or Bracket List)
  - `[]`
    - `[...]`: Accept any *one* of the characters within the brackets.

@@ -35,6 +65,9 @@ The JavaScript version.
- `|`: OR operator, e.g. `four|4` accepts "four" or "4".
- `\`: Escape sequence to accept a char with special meaning in regex.
  - Regex recognizes common escape sequences such as `\n` for newline, `\t` for tab, `\r` for carriage return, `\nnn` for an octal number of up to 3 digits, `\xhh` for a two-digit hex code, and `\uhhhh` for a 4-digit Unicode escape. The 8-digit `\uhhhhhhhh` form is not JavaScript syntax; JavaScript uses `\u{...}` with the `u` flag for longer code points.
- Laziness
  - `*?`, `+?`, `??`, `{m,n}?`, `{m,}?`: Curbs greediness for repetition operators.

## References
- https://www3.ntu.edu.sg/home/ehchua/programming/howto/Regexe.html#:~:text=On%20the%20other%20hand%2C%20the,%5CD%20or%20non%2Ddigit
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#switching_words_in_a_string
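As a rough illustration of the capture-group entry in this revision, the sketch below shows the three places a captured group can be read in JavaScript: the match result, the replacement string (`$1`), and the pattern itself (`\1`). The sample strings and variable names are made up.

```js
// Read captured groups from the match result.
const m = 'John Smith'.match(/(\w+)\s(\w+)/);
const first = m[1];
const last = m[2];

// Reference groups in the replacement string with $1, $2, ...
const swapped = 'John Smith'.replace(/(\w+)\s(\w+)/, '$2, $1');

// Reference a group inside the pattern with \1: collapse a doubled word.
const deduped = 'the the cat'.replace(/\b(\w+)\s+\1\b/g, '$1');

console.log(first, last); // John Smith
console.log(swapped);     // Smith, John
console.log(deduped);     // the cat
```
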
regexyl created this gist
Feb 24, 2022

@@ -0,0 +1,40 @@
# Regex Cheatsheet
The JavaScript version.

## Frequent Examples

## Syntax
###
- Metacharacters
  - `.`: Any *one* character except a line terminator (in JavaScript, `[^\n\r\u2028\u2029]` unless the `s` flag is set).
  - `\d`, `\D`: Any *one* digit/non-digit character (where digits are `[0-9]`).
  - `\w`, `\W`: Any *one* word/non-word character. For ASCII, word characters are `[a-zA-Z0-9_]`.
  - `\s`, `\S`: Any *one* space/non-space character. For ASCII, whitespace characters are `[ \n\r\t\f]`.
- Occurrence Indicators
  - `+`: One or more, e.g. `[0-9]+` matches 1 or more digits, such as "123", "0000".
  - `*`: Zero or more (accepts the above + empty strings).
  - `?`: Zero or one (optional), e.g. `[+-]?` matches an optional "+", "-", or an empty string.
  - `{}`
    - `{m,n}`: `m` to `n` (both inclusive).
    - `{m}`: Exactly `m` times.
    - `{m,}`: `m` or more times (`m+`).
- Position Anchors
  - `^`: Start of line, e.g. `^[0-9]+$` matches a numeric string.
  - `$`: End of line
  - `\b`: Boundary of word, i.e., start-of-word or end-of-word. E.g., `\bcat\b` matches the word "cat" in the input string.
  - `\B`: Inverse of `\b`, i.e. non-start-of-word or non-end-of-word.
- Parenthesized Back References
  - `()`:
- Character Class (or Bracket List)
  - `[]`
    - `[...]`: Accept any *one* of the characters within the brackets.
    - `[.-.]`: Accept any *one* of the characters in the range, e.g. `[0-9]`, `[A-Za-z]`.
    - `[^...]`: Rejects any *one* of the characters, e.g. `[^0-9]` matches any non-digit.
    - Only `^`, `-`, `]`, `\` require an escape sequence inside the bracket list.
- `|`: OR operator, e.g. `four|4` accepts "four" or "4".
- `\`: Escape sequence to accept a char with special meaning in regex.
  - Regex recognizes common escape sequences such as `\n` for newline, `\t` for tab, `\r` for carriage return, `\nnn` for an octal number of up to 3 digits, `\xhh` for a two-digit hex code, and `\uhhhh` for a 4-digit Unicode escape. The 8-digit `\uhhhhhhhh` form is not JavaScript syntax; JavaScript uses `\u{...}` with the `u` flag for longer code points.

## References
- https://www3.ntu.edu.sg/home/ehchua/programming/howto/Regexe.html#:~:text=On%20the%20other%20hand%2C%20the,%5CD%20or%20non%2Ddigit.
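The syntax entries in this first revision can be tried out with a few one-line checks. This is a minimal sketch with hypothetical helper names (`isNumeric`, `isShortInt`, `hasNoDigits`); note that an anchored digit class needs a `+` to accept more than one digit.

```js
// Anchors plus an occurrence indicator: the whole input must be digits.
const isNumeric = (s) => /^[0-9]+$/.test(s);

// Optional sign, then one to three digits ({m,n} is inclusive).
const isShortInt = (s) => /^[+-]?[0-9]{1,3}$/.test(s);

// Negated character class: no digit anywhere in the input.
const hasNoDigits = (s) => /^[^0-9]*$/.test(s);

// Alternation: either spelling is accepted.
const fours = 'four or 4'.match(/four|4/g);

console.log(isNumeric('0000'));  // true
console.log(isNumeric('12a'));   // false
console.log(isShortInt('-42'));  // true
console.log(isShortInt('1234')); // false
console.log(hasNoDigits('abc')); // true
console.log(fours);              // [ 'four', '4' ]
```
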