2021년 12월 15일
이 글은 다음 언어로만 작성되어 있습니다. English, Español, Français, Italiano, 日本語, Русский, Türkçe, Українська, 简体中文. 한국어 번역에 참여해주세요.

Word boundary: \b

A word boundary \b is a test, just like ^ and $.

When the regexp engine (program module that implements searching for regexps) comes across \b, it checks that the position in the string is a word boundary.

There are three different positions that qualify as word boundaries:

  • At string start, if the first string character is a word character \w.
  • Between two characters in the string, where one is a word character \w and the other is not.
  • At string end, if the last string character is a word character \w.

For instance, regexp \bJava\b will be found in Hello, Java!, where Java is a standalone word, but not in Hello, JavaScript!.

alert( "Hello, Java!".match(/\bJava\b/) ); // Java
alert( "Hello, JavaScript!".match(/\bJava\b/) ); // null

In the string Hello, Java! following positions correspond to \b:

So, it matches the pattern \bHello\b, because:

  1. At the beginning of the string matches the first test \b.
  2. Then matches the word Hello.
  3. Then the test \b matches again, as we’re between o and a comma.

The pattern \bHello\b would also match. But not \bHell\b (because there’s no word boundary after l) and not Java!\b (because the exclamation sign is not a wordly character \w, so there’s no word boundary after it).

alert( "Hello, Java!".match(/\bHello\b/) ); // Hello
alert( "Hello, Java!".match(/\bJava\b/) );  // Java
alert( "Hello, Java!".match(/\bHell\b/) );  // null (no match)
alert( "Hello, Java!".match(/\bJava!\b/) ); // null (no match)

We can use \b not only with words, but with digits as well.

For example, the pattern \b\d\d\b looks for standalone 2-digit numbers. In other words, it looks for 2-digit numbers that are surrounded by characters different from \w, such as spaces or punctuation (or text start/end).

alert( "1 23 456 78".match(/\b\d\d\b/g) ); // 23,78
alert( "12,34,56".match(/\b\d\d\b/g) ); // 12,34,56
Word boundary \b doesn’t work for non-latin alphabets

The word boundary test \b checks that there should be \w on the one side from the position and "not \w" – on the other side.

But \w means a latin letter a-z (or a digit or an underscore), so the test doesn’t work for other characters, e.g. cyrillic letters or hieroglyphs.

과제

The time has a format: hours:minutes. Both hours and minutes has two digits, like 09:00.

Make a regexp to find time in the string: Breakfast at 09:00 in the room 123:456.

P.S. In this task there’s no need to check time correctness yet, so 25:99 can also be a valid result.

P.P.S. The regexp shouldn’t match 123:456.

The answer: \b\d\d:\d\d\b.

alert( "Breakfast at 09:00 in the room 123:456.".match( /\b\d\d:\d\d\b/ ) ); // 09:00
튜토리얼 지도

댓글

댓글을 달기 전에 마우스를 올렸을 때 나타나는 글을 먼저 읽어주세요.
  • 추가 코멘트, 질문 및 답변을 자유롭게 남겨주세요. 개선해야 할 것이 있다면 댓글 대신 이슈를 만들어주세요.
  • 잘 이해되지 않는 부분은 구체적으로 언급해주세요.
  • 댓글에 한 줄짜리 코드를 삽입하고 싶다면 <code> 태그를, 여러 줄로 구성된 코드를 삽입하고 싶다면 <pre> 태그를 이용하세요. 10줄 이상의 코드는 plnkr, JSBin, codepen 등의 샌드박스를 사용하세요.