The JavaScript version.
Search for: 1
- "/example/": /\/example\/[a-z]+/i
- Switch words in a string
let re = /(\w+)\s(\w+)/;
let str = 'John Smith';
let newstr = str.replace(re, '$2, $1');
console.log(newstr); // Smith, John
- Using an inline function that modifies the matched characters
function styleHyphenFormat(propertyName) {
  function upperToHyphenLower(match, offset, string) {
    return (offset > 0 ? '-' : '') + match.toLowerCase();
  }
  return propertyName.replace(/[A-Z]/g, upperToHyphenLower);
}
console.log(styleHyphenFormat('borderTop')); // border-top
- Converting Fahrenheit to Celsius
function f2c(x) {
  function convert(str, p1, offset, s) {
    // p1 is the captured number as a string; the subtraction coerces it to a number
    return ((p1 - 32) * 5/9) + 'C';
  }
  let s = String(x);
  let test = /(-?\d+(?:\.\d*)?)F\b/g; // (?:...) is a non-capturing group
  return s.replace(test, convert);
}
A word boundary (\b) is a zero width match that can match:
- Between a word character (\w) and a non-word character (\W), or
- Between a word character and the start or end of the string.
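
As a small illustration of the two cases above, here is a sketch of \b in JavaScript. The sample strings and variable names are made up for this example.

```javascript
// \b matches only at word boundaries, so \bcat\b finds "cat" as a whole word.
const wholeWord = /\bcat\b/;

console.log(wholeWord.test('the cat sat')); // true: non-word characters on both sides
console.log(wholeWord.test('concatenate')); // false: "cat" sits between word characters
console.log(wholeWord.test('cat'));         // true: start and end of string count as boundaries

// Because \b is zero width, replacing it inserts text without consuming characters.
const marked = 'hi there'.replace(/\b/g, '|');
console.log(marked); // |hi| |there|
```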
\B is the inverse of \b, also zero width. It can match:
- Between two word characters.
- Between two non-word characters.
- Between a non-word character and the start or end of the string.
- The empty string.
Finding a non-word boundary? Every position in a string is either a word boundary or a non-word boundary, so find the word boundaries, set them aside, and every position that is left is a non-word boundary.
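
The complement idea can be checked directly. The sketch below lists every \b and \B position in a sample string; `positions` is a hypothetical helper written for this example, not a built-in.

```javascript
const text = 'ab, c';

// Hypothetical helper: index of every zero-width match of a global pattern.
function positions(pattern, s) {
  return [...s.matchAll(pattern)].map((m) => m.index);
}

const wordBoundaries = positions(/\b/g, text);
const nonWordBoundaries = positions(/\B/g, text);

console.log(wordBoundaries);    // [ 0, 2, 4, 5 ]
console.log(nonWordBoundaries); // [ 1, 3 ]

// Together the two lists cover every position from 0 to text.length exactly once.
const all = [...wordBoundaries, ...nonWordBoundaries].sort((a, b) => a - b);
console.log(all.length === text.length + 1); // true

// The empty string has no word boundary, so only \B matches it.
console.log(/\b/.test(''), /\B/.test('')); // false true
```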
- Metacharacters
  - .: Any one character except newline, same as [^\n].
  - \d, \D: Any one digit/non-digit character (where digits are [0-9]).
  - \w, \W: Any one word/non-word character. For ASCII, word characters are [a-zA-Z0-9_].
  - \s, \S: Any one space/non-space character. For ASCII, whitespace characters are [ \n\r\t\f].
- Occurrence Indicators
  - +: One or more, e.g. [0-9]+ matches 1 or more digits, such as "123", "0000".
  - *: Zero or more (accepts the above + empty strings).
  - ?: Zero or one (optional), e.g., [+-]? matches an optional "+", "-", or an empty string.
  - {}
    - {m,n}: m to n (both inclusive).
    - {m}: Exactly m times.
    - {m,}: m or more times (m+).
- Position Anchors
  - ^: Start of line, e.g. ^[0-9]+$ matches a numeric string.
  - $: End of line.
  - \b: Boundary of word, i.e., start-of-word or end-of-word. E.g., \bcat\b matches the word "cat" in the input string.
  - \B: Inverse of \b, i.e. non-start-of-word or non-end-of-word.
- Parenthesized Back References (Capture Group)
  - (): Creates a capture group for extracting a substring or using a back reference.
    - Use $1, $2, ... (JS, Java, Perl), or \1, \2, ... (Python) to retrieve the back references in sequential order.
    - In JavaScript, $1 applies in the replacement string; inside the pattern itself, use \1.
- Character Class (or Bracket List)
  - []
    - [...]: Accepts any one of the characters within the bracket.
    - [.-.]: Accepts any one of the characters in the range, e.g. [0-9], [A-Za-z].
    - [^...]: Accepts any one character not in the bracket, e.g. [^0-9] matches any non-digit.
    - Only ^, -, ], \ require an escape sequence inside the bracket list.
  - |: OR operator, e.g. four|4 accepts "four" or "4".
  - \: Escape sequence to accept a char with special meaning in regex.
    - Regex recognizes common escape sequences such as \n for newline, \t for tab, \r for carriage-return, \nnn for an octal number of up to 3 digits, \xhh for a two-digit hex code, \uhhhh for a 4-digit Unicode, \uhhhhhhhh for an 8-digit Unicode.
- Laziness
  - *?, +?, ??, {m,n}?, {m,}?: Curbs greediness for repetition operators.
- Capturing matched pattern
  - $&: Represents the whole matched substring (usable in a replacement string).
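
A few entries from the list above can be tried out in JavaScript. This is only a sketch; the sample strings and variable names are invented for illustration.

```javascript
// Anchors plus an occurrence indicator: the whole string must be digits.
const numeric = /^[0-9]+$/;
console.log(numeric.test('2024')); // true
console.log(numeric.test('20a4')); // false

// Capture group with a back reference: \1 inside the pattern,
// $1 inside the replacement string.
const doubledWord = /\b(\w+)\s+\1\b/;
const fixed = 'it is is fine'.replace(doubledWord, '$1');
console.log(fixed); // it is fine

// Greedy vs. lazy: .+ grabs as much as possible, .+? as little as possible.
const html = '<b>bold</b>';
const greedy = html.match(/<.+>/)[0];
const lazy = html.match(/<.+?>/)[0];
console.log(greedy); // <b>bold</b>
console.log(lazy);   // <b>

// $& in a replacement string stands for the whole match.
const wrapped = 'price: 42'.replace(/\d+/, '[$&]');
console.log(wrapped); // price: [42]
```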
Footnotes
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#examples
- https://stackoverflow.com/questions/4541573/what-are-non-word-boundary-in-regex-b-compared-to-word-boundary
- https://www3.ntu.edu.sg/home/ehchua/programming/howto/Regexe.html#:~:text=On%20the%20other%20hand%2C%20the,%5CD%20or%20non%2Ddigit