Wednesday, 13 July 2016
Python regex to exclude email domains and special characters and extract @user mentions from Twitter text
I have a string of Twitter text, as follows:
text = ("RT@aquage_7: 田@tianke おっ(´・ω・`) @_@, @__田科, "
        "my email is tian@gmail.com, his@kate, I like @lucyさん, "
        "and her email is kate@163.cn")
The regex pattern is:
import re
p_name3 = re.compile(r'[@@]([a-zA-Z0-9_]{1,15})')
But the result is:
['aquage_7', 'tianke', '_', '__', 'gmail', 'kate', 'lucy', '163']
The result I want is:
['aquage_7', 'tianke', '__', 'kate', 'lucy']
That is, I want to exclude email domain names (please don't focus only on these two email domains) and special character sequences such as:
@_@, @____@.
In addition, note that a Twitter user name may contain only the characters a-zA-Z0-9_ and is between 1 and 15 characters long. Please help me solve this issue; it has troubled me for several days. Thanks in advance.
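One possible approach, offered as a sketch rather than a definitive answer, is to add a negative lookahead after the captured name. The rule assumed here is my reading of the examples above, not an official Twitter rule: a candidate is rejected when it is immediately followed by another name character, by "@", or by a dot plus a letter or digit (which looks like an email domain). The names `text` and `mention` are chosen for illustration only, and only the ASCII "@" is handled.

```python
import re

# Sample text from the question, joined into one valid string literal.
text = (
    "RT@aquage_7: 田@tianke おっ(´・ω・`) @_@, @__田科, "
    "my email is tian@gmail.com, his@kate, I like @lucyさん, "
    "and her email is kate@163.cn"
)

# Assumed rule (inferred from the examples, not from Twitter itself):
#   - capture 1 to 15 name characters after "@"
#   - reject if the next character is another name character; this also
#     stops the engine from backtracking to a shorter prefix such as "gmai"
#   - reject if the next character is "@" (e.g. the emoticon "@_@")
#   - reject if followed by "." plus a letter or digit (e.g. "gmail.com")
mention = re.compile(r'@([a-zA-Z0-9_]{1,15})(?![a-zA-Z0-9_@]|\.[a-zA-Z0-9])')

mentions = mention.findall(text)
print(mentions)
```

A known limitation of this sketch: a mention followed directly by a dot and a letter with no space (for example "@lucy.Hello") is treated like an email domain and skipped, while a mention at the end of a sentence ("@lucy. Hello") is still matched.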