Regexp problem with `('

Paul McGuire ptmcg at austin.rr.com
Thu Mar 22 11:51:16 EDT 2007


On Mar 22, 3:26 am, "Johny" <pyt... at hope.cz> wrote:
> I have  the following text
>
> <title>Goods Item  146 (174459989)  - OurWebSite</title>
>
> from which I need to extract
> `Goods Item  146 '
>
> Can anyone help with regexp?
> Thank you for help
> L.
Here's the immediate answer to your question.


import re
src = "<title>Goods Item  146 (174459989)  - OurWebSite</title>"
# Capture everything between "<title>" and the "(" that follows it.
pattern = r"<title>(.*)\("
print(re.search(pattern, src).group(1))


I post it this way so that you can relate the regular expression to
your specific question, and then maybe apply it to whatever else you
are scraping from this web page.
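
When applying the same pattern to other lines, two details are worth
keeping in mind: re.search returns None when nothing matches, and a
greedy .* runs to the last "(" on the line rather than the first.  A
minimal sketch that guards against both follows; the helper name
title_before_paren and the non-greedy .*? are assumptions added for
illustration, not part of the pattern given above.

```python
import re

# Hypothetical helper (not from the original post): return the title text
# that precedes the first "(", or None when the pattern does not match.
TITLE_RE = re.compile(r"<title>(.*?)\(")  # .*? stops at the first "("

def title_before_paren(html):
    match = TITLE_RE.search(html)
    # re.search returns None on failure, so check before calling .group()
    return match.group(1) if match else None

src = "<title>Goods Item  146 (174459989)  - OurWebSite</title>"
print(repr(title_before_paren(src)))
print(title_before_paren("<title>No parenthesis here</title>"))
```

The captured text keeps its trailing space; call .strip() on the result
if that is not wanted.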

Please don't follow up with a post asking how to extract "45","Rubber
chicken" from "<tr><td>45</td><td>Rubber chicken</td></tr>".  At this
point, you should try a little experimentation on your own.

-- Paul



