Without further ado, here is the code:
package top.yangxianyang.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.junit.Test;

public class Test1 {

    // Matching
    @Test
    public void match() {
        String qq = "2017-09-19";

        // Regular expression that validates a date in YYYY-MM-DD format
        String regex =
                // Year 0001-9999 followed by a valid month-day pair (February up to the 28th)
                "(([0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]{1}|[0-9]{1}[1-9][0-9]{2}|[1-9][0-9]{3})-"
                + "(((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01]))"
                + "|((0[469]|11)-(0[1-9]|[12][0-9]|30))"
                + "|(02-(0[1-9]|[1][0-9]|2[0-8]))))"
                // February 29, allowed only in leap years
                + "|((([0-9]{2})(0[48]|[2468][048]|[13579][26])"
                + "|((0[48]|[2468][048]|[13579][26])00))-02-29)";

        // Regular expression that validates a date in DD/MM/YYYY format (not used by this test)
        String regex2 =
                "(((0[1-9]|[12][0-9]|3[01])/((0[13578]|1[02]))"
                + "|((0[1-9]|[12][0-9]|30)/(0[469]|11))"
                + "|(0[1-9]|[1][0-9]|2[0-8])/(02))/"
                + "([0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]{1}|[0-9]{1}[1-9][0-9]{2}|[1-9][0-9]{3}))"
                + "|(29/02/(([0-9]{2})(0[48]|[2468][048]|[13579][26])"
                + "|((0[48]|[2468][048]|[13579][26])00)))";

        boolean flag = qq.matches(regex);
        if (flag)
            System.out.println(qq + "...is ok");
        else
            System.out.println(qq + "...is invalid");
    }

    // Splitting
    @Test
    public void splitDemo() {
        String str = "avgabbageigaglsdabc";
        String regex = "a"; // split on "a"
        String[] arr = str.split(regex);
        System.out.println(arr.length);
        for (String s : arr) {
            System.out.println(s);
        }
    }

    // Replacing
    @Test
    public void replaceAllDemo() {
        String str = "wer1389980000t545y1234564uiod234345675f";
        // Replace every run of five or more consecutive digits with #.
        str = str.replaceAll("\\d{5,}", "#");
        System.out.println(str);
    }

    // Extracting the substrings that match a rule
    @Test
    public void getDemo() {
        String str = "A regular expression, specified as a string,instance of this class.";
        System.out.println(str);
        String reg = "\\b[a-z]{2}\\b"; // match two-letter words

        // Compile the rule into a Pattern object.
        Pattern p = Pattern.compile(reg);
        // Associate the pattern with the input string to obtain a Matcher.
        Matcher m = p.matcher(str);
        // String.matches is itself built on Pattern and Matcher. The String
        // wrapper is convenient, but it offers only that single operation.
        // m.find() applies the rule to the string and looks for the next matching substring.
        while (m.find()) {
            // m.group() returns the substring found by the last find().
            System.out.println(m.group());
            // start() is the index of the first matched character (inclusive);
            // end() is the index after the last matched character (exclusive).
            System.out.println(m.start() + "...." + m.end());
        }
    }
}
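The comment in getDemo says that String.matches is a convenience wrapper over Pattern and Matcher. A minimal sketch of that relationship follows. The class name MatchesEquivalence and its helper methods are hypothetical, introduced only for illustration, and the short date pattern is a simplified stand-in rather than the full validation regex above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and helpers are illustrative only.
final class MatchesEquivalence {

    // Whole-string match through the String convenience method.
    static boolean viaString(String input, String regex) {
        return input.matches(regex);
    }

    // The same check spelled out with Pattern and Matcher.
    static boolean viaPattern(String input, String regex) {
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(input);
        return m.matches(); // matches() requires the entire input to match
    }

    // find() succeeds as soon as any substring matches.
    static boolean containsMatch(String input, String regex) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String regex = "\\d{4}-\\d{2}-\\d{2}"; // simplified shape check, not a calendar check
        System.out.println(viaString("2017-09-19", regex));        // true
        System.out.println(viaPattern("2017-09-19", regex));       // true
        System.out.println(viaString("on 2017-09-19", regex));     // false
        System.out.println(containsMatch("on 2017-09-19", regex)); // true
    }
}
```

When the same pattern is applied many times, compiling it once with Pattern.compile and reusing the Pattern avoids recompiling on every call, which is what String.matches does internally.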
Regular expression syntax
In Java, \\ means: "I am inserting a regular-expression backslash, so the character that follows has special meaning."
In other languages (such as Perl), a single backslash \ is enough to act as an escape. In Java,
a regular expression written as a string literal needs two backslashes to produce the escape that other languages write with one. Put simply, two backslashes \\ in a Java regular expression stand for one \ in other languages.
That is why the regular expression for a single digit is written \\d, and a literal backslash is written \\\\.
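The doubling rule can be checked directly. The sketch below is a minimal illustration; the class name BackslashDemo, its fields, and the sample path string are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class BackslashDemo {

    // Java source "\\d" is the two-character regex \d: one digit.
    static final Pattern DIGIT = Pattern.compile("\\d");

    // Java source "\\\\" is the two-character regex \\: one literal backslash.
    static final Pattern BACKSLASH = Pattern.compile("\\\\");

    static boolean isSingleDigit(String s) {
        return DIGIT.matcher(s).matches();
    }

    static int countBackslashes(String s) {
        Matcher m = BACKSLASH.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("\\d".length());                     // 2
        System.out.println(isSingleDigit("7"));                  // true
        System.out.println(isSingleDigit("x"));                  // false
        System.out.println(countBackslashes("C:\\temp\\a.txt")); // 2
    }
}
```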
Character Description
\ Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, "n" matches the character "n", while "\n" matches a newline. The sequence "\\" matches "\", and "\(" matches "(".
^ Matches the start of the input string. If multiline mode is enabled (the RegExp object's Multiline property; Pattern.MULTILINE in Java), ^ also matches the position after "\n" or "\r".
$ Matches the end of the input string. If multiline mode is enabled (the RegExp object's Multiline property; Pattern.MULTILINE in Java), $ also matches the position before "\n" or "\r".
* Matches the preceding character or subexpression zero or more times. For example, "zo*" matches "z" and "zoo". * is equivalent to {0,}.
+ Matches the preceding character or subexpression one or more times. For example, "zo+" matches "zo" and "zoo" but not "z". + is equivalent to {1,}.
? Matches the preceding character or subexpression zero or one time. For example, "do(es)?" matches "do" or "does". ? is equivalent to {0,1}.
{n} n is a non-negative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob", but matches the two "o"s in "food".
{n,} n is a non-negative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob", but matches all the "o"s in "foooood". "o{1,}" is equivalent to "o+", and "o{0,}" is equivalent to "o*".
{n,m} m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, "o{1,3}" matches the first three "o"s in "fooooood". "o{0,1}" is equivalent to "o?". Note: no spaces are allowed between the comma and the numbers.
? When this character follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match becomes "non-greedy". A non-greedy pattern matches as short a string as possible, whereas the default "greedy" pattern matches as long a string as possible. For example, in the string "oooo", "o+?" matches a single "o" at a time, while "o+" matches all the "o"s.
. Matches any single character except the line terminators "\r" and "\n". To match any character including "\r" and "\n", use a pattern such as "[\s\S]".
(pattern) Matches pattern and captures the matched subexpression. The captured matches can be retrieved from the resulting Matches collection through $0…$9. To match the parenthesis characters ( ), use "\(" or "\)".
(?:pattern) Matches pattern but does not capture the matched subexpression. It is a non-capturing match, so the match is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more economical expression than 'industry|industries'.
(?=pattern) Positive lookahead. Matches the search string at any point where a string matching pattern begins. It is a non-capturing match, so the match cannot be captured for later use. For example, 'Windows (?=95|98|NT|2000)' matches the "Windows" in "Windows 2000" but not the "Windows" in "Windows 3.1". Lookahead does not consume characters: after a match, the search for the next match starts immediately after the previous match, not after the characters that made up the lookahead.
(?!pattern) Negative lookahead. Matches the search string at any point where a string matching pattern does not begin. It is a non-capturing match, so the match cannot be captured for later use. For example, 'Windows (?!95|98|NT|2000)' matches the "Windows" in "Windows 3.1" but not the "Windows" in "Windows 2000". Lookahead does not consume characters: after a match, the search for the next match starts immediately after the previous match, not after the characters that made up the lookahead.
x|y Matches x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz] Character set. Matches any one of the enclosed characters. For example, "[abc]" matches the "a" in "plain".
[^xyz] Negated character set. Matches any character that is not enclosed. For example, "[^abc]" matches the "p", "l", "i", and "n" in "plain".
[a-z] Character range. Matches any character in the specified range. For example, "[a-z]" matches any lowercase letter from "a" to "z".
[^a-z] Negated character range. Matches any character that is not in the specified range. For example, "[^a-z]" matches any character that is not in the range "a" to "z".
\b Matches a word boundary, such as the position between a word and a space. For example, "er\b" matches the "er" in "never" but not the "er" in "verb".
\B Matches a non-word boundary. "er\B" matches the "er" in "verb" but not the "er" in "never".
\cx Matches the control character indicated by x. For example, \cM matches Control-M, the carriage return character. The value of x must be in the range A-Z or a-z. If it is not, c is assumed to be the literal character "c".
\d Matches a digit character. Equivalent to [0-9].
\D Matches a non-digit character. Equivalent to [^0-9].
\f Matches a form feed. Equivalent to \x0c and \cL.
\n Matches a newline. Equivalent to \x0a and \cJ.
\r Matches a carriage return. Equivalent to \x0d and \cM.
\s Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v].
\S Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t Matches a tab. Equivalent to \x09 and \cI.
\v Matches a vertical tab. Equivalent to \x0b and \cK. (In Java 8 and later, \v matches any vertical whitespace character, which includes \x0b.)
\w Matches any word character, including the underscore. Equivalent to "[A-Za-z0-9_]".
\W Matches any non-word character. Equivalent to "[^A-Za-z0-9_]".
\xn Matches n, where n is a hexadecimal escape code. The hexadecimal escape code must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" followed by "1". This allows ASCII codes to be used in regular expressions.
\num Matches num, where num is a positive integer. A backreference to a captured match. For example, "(.)\1" matches two consecutive identical characters.
\n Identifies an octal escape code or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape code.
\nm Identifies an octal escape code or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by the character m. If neither condition holds, \nm matches the octal value nm, where n and m are octal digits (0-7).
\nml Matches the octal escape code nml when n is an octal digit (0-3) and m and l are octal digits (0-7).
\un Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
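A few rows of the table can be tried out with Matcher.find. The sketch below is a minimal illustration under stated assumptions: the class name SyntaxTableDemo, the helpers findAll and firstStart, and the sample inputs "bookkeeper" and "never verb" are all hypothetical. It covers greedy versus non-greedy quantifiers, positive and negative lookahead, a backreference, and a word boundary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names and sample inputs are illustrative only.
final class SyntaxTableDemo {

    // Collects every substring that find() reports for the given regex.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    // Returns the start index of the first match, or -1 if there is none.
    static int firstStart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // Greedy vs. non-greedy quantifier
        System.out.println(findAll("o+", "oooo"));  // [oooo]
        System.out.println(findAll("o+?", "oooo")); // [o, o, o, o]

        // Lookahead: the second "Windows" starts at index 14
        String text = "Windows 2000, Windows 3.1";
        System.out.println(firstStart("Windows (?=95|98|NT|2000)", text)); // 0
        System.out.println(firstStart("Windows (?!95|98|NT|2000)", text)); // 14

        // Backreference: two consecutive identical characters
        System.out.println(findAll("(.)\\1", "bookkeeper")); // [oo, kk, ee]

        // Word boundary: only the "er" that ends "never" qualifies
        System.out.println(findAll("er\\b", "never verb")); // [er]
    }
}
```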
As required by the Java Language Specification, backslashes in string literals in Java source code are interpreted as either Unicode escapes or other character escapes. You must therefore double the backslashes in any string literal that represents a regular expression, to protect them from interpretation by the Java bytecode compiler. For example, when interpreted as a regular expression, the string literal "\b" matches a single backspace character, whereas "\\b" matches a word boundary. The string literal "\(hello\)" is illegal and causes a compile-time error; to match the string (hello), the string literal "\\(hello\\)" must be used.
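The difference between the single-backslash and double-backslash literals can be observed at run time. This is a minimal sketch; the class name LiteralEscapeDemo and the helper found are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class LiteralEscapeDemo {

    // True if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // "\b" is the Java escape for backspace (U+0008), so the regex is that one character.
        System.out.println(found("\b", "a b"));   // false
        System.out.println(found("\b", "a\bb"));  // true

        // "\\b" reaches the regex engine as \b, a word boundary.
        System.out.println(found("\\b", "a b"));  // true

        // "\\(hello\\)" reaches the engine as \(hello\), which matches literal parentheses.
        System.out.println("(hello)".matches("\\(hello\\)")); // true
        System.out.println("hello".matches("\\(hello\\)"));   // false
    }
}
```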