I don't understand what \n and \r mean in
tweet2 = re.sub(r'https?://[^\s\n\r]+', '', tweet2)
What kind of special sequences do they represent?
Also, what is the purpose of the ? after https? I think plain https: would work as well.
https?://
? refers to zero or one occurrence of the preceding character or group.
Ex. ba? will match b or ba
Here pattern https? will match http or https
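A quick way to check this is to test the pattern against a few strings. This is only a minimal sketch: the test strings are made up for illustration, and `re.fullmatch` is used so that the whole string has to match the pattern.

```python
import re

# "s?" makes the "s" optional: zero or one occurrence.
scheme = re.compile(r"https?")

matches_http = scheme.fullmatch("http") is not None      # no "s": matches
matches_https = scheme.fullmatch("https") is not None    # one "s": matches
matches_httpss = scheme.fullmatch("httpss") is not None  # two "s": no match

# Without the "?", the "s" is required, so plain http links are missed.
strict = re.compile(r"https")
strict_matches_http = strict.fullmatch("http") is not None

print(matches_http, matches_https, matches_httpss, strict_matches_http)
```

So plain https: would only remove https links, while https? removes both http and https links.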
[^\s\n\r]+
+ refers to one or more occurrences of the preceding character or group
[^\s\n\r] → Matches any single character except \s, \n and \r (\s is any whitespace character, \n is a newline, \r is a carriage return). Here, \n and \r seem unnecessary because \s already covers them.
[^\s\n\r]+ → Matches one or more characters, stopping at the first whitespace character (space, \n, \r, \t, etc.)
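To see that \n and \r are redundant inside the character class, here is a small sketch comparing the original pattern with a shorter one that uses \S (which means "any non-whitespace character", the same as [^\s]). The sample tweet and its URLs are invented for the example.

```python
import re

# Hypothetical tweet with URLs followed by a space, "\n", and "\r\n".
tweet = "Check this out https://example.com/page?id=1 and\nhttp://test.org\r\nbye"

# Pattern from the question.
original = re.sub(r"https?://[^\s\n\r]+", "", tweet)

# Shorter pattern: \S+ is one or more non-whitespace characters.
simplified = re.sub(r"https?://\S+", "", tweet)

print(repr(original))
print(repr(simplified))
print(original == simplified)
```

Both calls remove the same text, because each URL match stops at the first whitespace character, whether that is a space, \n or \r.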