Confusion in Java Regular Expression [duplicate]

Author: 傻傻的笑没心没肺wy | Source: Internet | 2023-09-18 05:19

This question already has an answer here:

Regex plus vs star difference? 9 answers

What do the following two regular expressions mean?

.*? and .+?

I understand what each of these symbols means on its own, i.e.

'.' -> Any character
'*' -> 0 or more times
'+' -> once or more times
'?' -> 0 or 1

But I am really confused about how .*? and .+? behave. Could anybody provide proper examples for these cases?

You are most welcome to share good links that present useful examples and practices. Thanks in advance.

3 Answers

#1

Breaking each pattern down, .*? is:

. - Any character
* - Any number of times (zero or more)
? - Consumed reluctantly

and .+? is:

. - Any character
+ - At least once
? - Consumed reluctantly

A reluctant or "non-greedy" quantifier (the trailing '?') matches as little as possible in order to find a match. A more in-depth look at quantifiers (greedy, reluctant and possessive) can be found here
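
As a rough illustration of "as little as possible", the sketch below runs a greedy and a reluctant pattern over the same input. The class name GreedyVsReluctant, the helper firstMatch and the sample input are made up for this example and do not come from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {
    // Hypothetical helper: returns the first substring of input matched by regex,
    // or null if the pattern does not match anywhere.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<a><b>";
        // Greedy: .+ takes as much as it can, so the match runs to the last '>'.
        System.out.println(firstMatch("<.+>", input));   // prints <a><b>
        // Reluctant: .+? stops at the first '>' that lets the whole pattern succeed.
        System.out.println(firstMatch("<.+?>", input));  // prints <a>
    }
}
```

Both patterns match; the trailing '?' only changes how much of the input the quantifier ends up consuming.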

#2

.*? and .+? are both reluctant quantifiers.

They start at the beginning of the input string, then reluctantly eat one character at a time looking for a match. The last thing they try is the entire input string.

Consider the code:

        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        public class ReluctantDemo {
            public static void main(String[] args) {
                String lines = "some";

                // .+? must consume at least one character, so every match is one character long.
                Matcher matcher = Pattern.compile(".+?").matcher(lines);
                while (matcher.find()) {
                    String result = matcher.group();
                    System.out.println("RESULT of .+? : " + result);
                    System.out.println("RESULT LENGTH : " + result.length());
                }
                System.out.println(lines);

                // .*? may consume nothing, so it matches the empty string at every position.
                Matcher matcher1 = Pattern.compile(".*?").matcher(lines);
                while (matcher1.find()) {
                    int start = matcher1.start();
                    int end = matcher1.end();
                    String result = matcher1.group();
                    System.out.println("RESULT of .*? : " + result);
                    System.out.println("RESULT LENGTH : " + result.length() + " ,  start " + start + " end :" + end);
                }
            }
        }

OUTPUT:

RESULT of .+? : s
RESULT LENGTH : 1
RESULT of .+? : o
RESULT LENGTH : 1
RESULT of .+? : m
RESULT LENGTH : 1
RESULT of .+? : e
RESULT LENGTH : 1
some
RESULT of .*? : 
RESULT LENGTH : 0 ,  start 0 end :0
RESULT of .*? : 
RESULT LENGTH : 0 ,  start 1 end :1
RESULT of .*? : 
RESULT LENGTH : 0 ,  start 2 end :2
RESULT of .*? : 
RESULT LENGTH : 0 ,  start 3 end :3
RESULT of .*? : 
RESULT LENGTH : 0 ,  start 4 end :4

.+? must consume at least one character, so it matches each character of the input in turn (length 1).

.*? is allowed to consume nothing, so with nothing after it in the pattern it matches the empty string at every position, including the end of the input (start 4, end 4).
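
The difference between the two becomes easier to see once something follows the quantifier. The sketch below is only an illustration: the class name StarVsPlusReluctant, the helper firstMatch and the inputs "ab" and "abb" are assumptions chosen for the example, not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StarVsPlusReluctant {
    // Hypothetical helper: first substring matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // .*? may match zero characters, so 'a' directly followed by 'b' is enough.
        System.out.println(firstMatch("a.*?b", "ab"));   // prints ab
        // .+? needs at least one character between 'a' and 'b', so "ab" cannot match.
        System.out.println(firstMatch("a.+?b", "ab"));   // prints null
        // On "abb" both succeed, but .+? has to swallow the first 'b' to get its one character.
        System.out.println(firstMatch("a.*?b", "abb"));  // prints ab
        System.out.println(firstMatch("a.+?b", "abb"));  // prints abb
    }
}
```

In both cases the quantifier still takes the minimum it can; the minimum is zero characters for .*? and one character for .+?.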

#3

To illustrate, consider the input string xfooxxxxxxfoo.

Enter your regex: .*foo  // greedy quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.

Enter your regex: .*?foo  // reluctant quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfoo" starting at index 0 and ending at index 4.
I found the text "xxxxxxfoo" starting at index 4 and ending at index 13.

Enter your regex: .*+foo // possessive quantifier
Enter input string to search: xfooxxxxxxfoo
No match found.

The first example uses the greedy quantifier .* to find "anything", zero or more times, followed by the letters "f" "o" "o". Because the quantifier is greedy, the .* portion of the expression first eats the entire input string. At this point, the overall expression cannot succeed, because the last three letters ("f" "o" "o") have already been consumed. So the matcher slowly backs off one letter at a time until the rightmost occurrence of "foo" has been regurgitated, at which point the match succeeds and the search ends.

The second example, however, is reluctant, so it starts by first consuming "nothing". Because "foo" doesn't appear at the beginning of the string, it's forced to swallow the first letter (an "x"), which triggers the first match at 0 and 4. Our test harness continues the process until the input string is exhausted. It finds another match at 4 and 13.

The third example fails to find a match because the quantifier is possessive. In this case, the entire input string is consumed by .*+, leaving nothing left over to satisfy the "foo" at the end of the expression. Use a possessive quantifier for situations where you want to seize all of something without ever backing off; it will outperform the equivalent greedy quantifier in cases where the match is not immediately found.
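
The three runs above can be reproduced without the interactive test harness. The following is a minimal sketch; the class name QuantifierHarness and the helper findAll are invented for this example, and the "text[start,end)" output format is an arbitrary choice rather than the tutorial's own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierHarness {
    // Hypothetical helper: collects every match as "text[start,end)".
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group() + "[" + m.start() + "," + m.end() + ")");
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: one match covering the whole input.
        System.out.println(findAll(".*foo", input));   // prints [xfooxxxxxxfoo[0,13)]
        // Reluctant: two shorter matches, each ending at the nearest "foo".
        System.out.println(findAll(".*?foo", input));  // prints [xfoo[0,4), xxxxxxfoo[4,13)]
        // Possessive: .*+ keeps everything it consumed, so "foo" can never match.
        System.out.println(findAll(".*+foo", input));  // prints []
    }
}
```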

This explanation comes from the Oracle Java tutorial on quantifiers: http://docs.oracle.com/javase/tutorial/essential/regex/quant.html
